(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 

International Bureau 

(43) International Publication Date 
18 April 2002 (18.04.2002) 




PCT 




(10) International Publication Number 

WO 02/30268 A2 



(51) International Patent Classification 7 



A61B 



(21) International Application Number: PCT/US01/32045

(22) International Filing Date: 12 October 2001 (12.10.2001) 



(25) Filing Language: English

(26) Publication Language: English



(30) Priority Data:

09/687,576    13 October 2000 (13.10.2000)    US
09/733,742    8 December 2000 (08.12.2000)    US
09/733,288    8 December 2000 (08.12.2000)    US
60/263,957    24 January 2001 (24.01.2001)    US
60/276,888    16 March 2001 (16.03.2001)      US
60/276,791    16 March 2001 (16.03.2001)      US
60/281,922    6 April 2001 (06.04.2001)       US
60/286,214    24 April 2001 (24.04.2001)      US
09/847,046    30 April 2001 (30.04.2001)      US
60/288,589    4 May 2001 (04.05.2001)         US



(71) Applicant: EOS BIOTECHNOLOGY, INC. [US/US]; 
225 A Gateway Boulevard, South San Francisco, CA 
94080-7019 (US). 

(72) Inventors: GISH, Kurt, C.; 40 Perego Terrace #2,
San Francisco, CA 94131 (US). MACK, David, H.;
2076 Monterey Avenue, Menlo Park, CA 94025 (US).
WILSON, Keith, E.; 219 Jeter Street, Redwood City,
CA 94062 (US). AFAR, Daniel; 435 Visitacion Avenue,
Brisbane, CA 94005 (US). HEVEZI, Peter; 1360 11th
Avenue, San Francisco, CA 94122 (US).

(74) Agents: BASTIAN, Kevin, L. et al.; Townsend and 
Townsend and Crew LLP, Two Embarcadero Center, 8th 
Floor, San Francisco, CA 94111-3834 (US).

(81) Designated States (national): AE, AG, AL, AM, AT, AU,
AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CO, CR, CU,
CZ, DE, DK, DM, DZ, EC, EE, ES, FI, GB, GD, GE, GH,
GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC,
LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW,
MX, MZ, NO, NZ, PH, PL, PT, RO, RU, SD, SE, SG, SI,
SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA,
ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian
patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European
patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE,
IT, LU, MC, NL, PT, SE, TR), OAPI patent (BF, BJ, CF,
CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD,
TG).

Published: 

— without international search report and to be republished 
upon receipt of that report

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.

(54) Title: METHODS OF DIAGNOSIS OF PROSTATE CANCER, COMPOSITIONS AND METHODS OF SCREENING FOR
MODULATORS OF PROSTATE CANCER

(57) Abstract: Described herein are genes whose expression is up-regulated or down-regulated in prostate cancer. Also described
are such genes whose expression is further up-regulated or down-regulated in drug-resistant prostate cancer cells. Related methods
and compositions that can be used for diagnosis and treatment of prostate cancer are disclosed. Also described herein are methods
that can be used to identify modulators of prostate cancer.






WO 02/30268 PCT/US01/32045 



METHODS OF DIAGNOSIS OF PROSTATE CANCER, 
COMPOSITIONS AND METHODS OF SCREENING FOR 
MODULATORS OF PROSTATE CANCER 


CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, all of which
are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to the identification of nucleic acid and protein
expression profiles and nucleic acids, products, and antibodies thereto that are involved in
prostate cancer, and to the use of such expression profiles and compositions in the diagnosis,
prognosis and therapy of prostate cancer. The invention further relates to methods for
identifying and using agents and/or targets that inhibit prostate cancer.

BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and the
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and incidence of prostate cancer has been increasing
rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J. Urol.
7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). It develops as the
result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion
of surrounding tissue and, ultimately, metastasis.

Deaths from prostate cancer are a result of metastasis of a prostate tumor.
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Rago,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for diagnosis and prognosis of prostate cancer,
and for effective treatment of prostate cancer, including in particular metastatic prostate
cancer, would be desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.

SUMMARY OF THE INVENTION
The present invention therefore provides nucleotide sequences of genes that
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic
purposes, and also as targets for screening for therapeutic compounds that modulate prostate
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan by the following description of the invention.

In one aspect, the present invention provides a method of detecting a prostate
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining 
the level of a prostate cancer associated transcript in a cell from a patient. 

In one embodiment, the present invention provides a method of detecting a
prostate cancer-associated transcript in a cell from a patient, the method comprising
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1-16.

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent
label.

In one embodiment, the polynucleotide is immobilized on a solid surface.

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human.

In one embodiment, the patient is suspected of having a taxol-resistant cancer.

In one embodiment, the prostate cancer-associated transcript is mRNA.

In one embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug resistant (e.g., taxol resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in,
the therapeutic treatment.

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1-16. 

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is 
encoded by a nucleic acid molecule having polynucleotide sequence as shown in Tables 1-16. 

In another aspect, the present invention provides an antibody that specifically
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having
polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical.

In one embodiment, the antibody is an antibody fragment. In another
embodiment, the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein.

In another aspect, the present invention provides a method of detecting
antibodies specific to prostate cancer in a patient, the method comprising contacting a
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16.

In another aspect, the present invention provides a method for identifying a
compound that modulates a prostate cancer-associated polypeptide, the method comprising
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional
effect of the compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic
effect, or a chemical effect.

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand 
binding to the polypeptide. 

In another aspect, the present invention provides a method of inhibiting
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein.

In one embodiment, the compound is an antibody.

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; (ii) comparing the level of gene expression of a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell
sample therefrom that has not been treated with the test compound. In another embodiment,
the control is a normal cell or mammal.

In one embodiment, the test compound is administered in varying amounts or
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the
drug candidate.

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment the plurality of polynucleotides is from three to ten.
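
By way of illustration only, the individual comparison of several transcript levels against a control can be sketched as below. The sketch assumes that expression levels are already available as positive numbers per transcript; the function names, the gene labels, and the two-fold cutoff (`min_abs_log2=1.0`) are hypothetical choices made for the example and are not taken from this description.

```python
import math


def log2_fold_changes(treated, control):
    """Return log2(treated / control) for each transcript measured in both samples."""
    changes = {}
    for name in treated.keys() & control.keys():
        t, c = treated[name], control[name]
        if t <= 0 or c <= 0:
            continue  # ratio is undefined for non-positive levels; skip the transcript
        changes[name] = math.log2(t / c)
    return changes


def modulated_transcripts(treated, control, min_abs_log2=1.0):
    """Keep transcripts whose level differs from control by at least the cutoff.

    min_abs_log2 is an illustrative threshold (1.0 corresponds to a two-fold change).
    """
    return {
        name: lfc
        for name, lfc in log2_fold_changes(treated, control).items()
        if abs(lfc) >= min_abs_log2
    }


if __name__ == "__main__":
    # Hypothetical expression levels for three transcripts.
    treated = {"geneA": 400.0, "geneB": 100.0, "geneC": 25.0}
    control = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0}
    print(modulated_transcripts(treated, control))
```

Each transcript is compared only with its own control level, so an increase in one transcript does not mask a decrease in another.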

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug
candidates by providing a cell expressing a gene that is up- or down-regulated as in a
prostate cancer. In one embodiment, a gene is selected from Tables 1-16. The method
further includes adding a drug candidate to the cell and determining the effect of the drug
candidate on the expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16 in a first tissue type of a first individual, and comparing that expression to the
expression of the gene from a second normal tissue type from the first individual or a second
unaffected individual. A difference in the expression indicates that the first individual has a
disorder associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence 
of a gene that is not up- and down-regulated in prostate cancer. 

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Tables 1-16.

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer 
protein, or a fragment thereof, comprising contacting an agent specific for said protein with 
said protein in an amount sufficient to effect neutralization. In another embodiment, the 
protein is encoded by a nucleic acid selected from those of Tables 1-16. 

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer.

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule, or antibody that inhibits or alters a calcium signal mediated by
PBH1, will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, are also useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den Eynde
and Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1 indicates that
it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize PBH1-specific
cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided by this
invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of which
are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of
genes that exhibit increased or decreased expression in prostate cancer samples.



Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode for the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translation
processing or splicing, including alternative splicing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
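
By way of illustration only, a percent identity calculation over two sequences that have already been aligned can be sketched as follows. The sketch assumes that gaps are written as "-" and that identity is counted over all aligned columns; alignment programs differ in how they treat gapped columns, so this denominator is an assumption of the example. The function names are hypothetical, and the 80% and 25-residue defaults simply echo figures recited above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns that hold the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap character.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a, aligned_b, threshold=80.0, min_length=25):
    """True if the aligned region is long enough and reaches the identity threshold."""
    if len(aligned_a) < min_length:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Nine of ten aligned positions are the same.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

The length check reflects the statement above that identity is preferably assessed over a region of at least about 25 residues, so a short perfect match alone does not satisfy `meets_identity`.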

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of any
one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
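
The global and local alignment approaches named above can be sketched as simple dynamic programming routines. The following Python example is an illustrative sketch only: the scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for brevity, and the sketch returns only the optimal score rather than the alignment itself.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score (Needleman-Wunsch style, linear gaps)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a: str, b: str, match: int = 1,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Optimal local alignment score (Smith-Waterman style, linear gaps)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment never lets a cell drop below zero.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
    print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```

The global routine penalizes unaligned ends, whereas the local routine reports the best-scoring subsequence pair, which is why the two methods can give different results for the same inputs.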

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of
the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)) alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
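
The seeding and extension steps described above can be illustrated with a simplified Python sketch for nucleotide sequences. This is not the BLAST implementation: it seeds on exact word matches only (the neighborhood threshold T is ignored), performs ungapped extension, and uses hypothetical function names and an arbitrary drop-off value X chosen for the example. The reward M=5 and penalty N=-4 follow the BLASTN defaults given in the text.

```python
def find_seeds(query: str, subject: str, w: int = 11):
    """Return (query_pos, subject_pos) pairs for exact word hits of length w."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def extend_one_direction(query, subject, qi, sj, step, start_score, m, n, x):
    """Extend from (qi, sj) in one direction; return (best score, best length)."""
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt on a drop of X from the maximum, or a score of zero or below.
        if best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    # Falling out of the loop means the end of either sequence was reached.
    return best, best_len


def extend_hit(query, subject, qi, sj, w, m=5, n=-4, x=20):
    """Ungapped extension of a word hit; returns (score, query_start, query_end)."""
    seed_score = w * m
    right_best, right_len = extend_one_direction(
        query, subject, qi + w, sj + w, +1, seed_score, m, n, x)
    left_best, left_len = extend_one_direction(
        query, subject, qi - 1, sj - 1, -1, right_best, m, n, x)
    return left_best, qi - left_len, qi + w + right_len


if __name__ == "__main__":
    q = "AAACCCGGGTTT"
    s = "TTAAACCCGGGTTTAA"
    for qi, sj in find_seeds(q, s, w=6):
        print(qi, sj, extend_hit(q, s, qi, sj, w=6))
```

Larger values of W give fewer seeds and faster searches at the cost of sensitivity, while larger values of X allow extension through longer poorly matching stretches.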

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
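
The probability thresholds stated above can be expressed as a simple classification. The Python sketch below is illustrative only: the function name and tier labels are hypothetical, and the cutoffs are treated as exact values even though the text qualifies each of them with "about".

```python
def similarity_tier(p_n: float) -> str:
    """Map a smallest sum probability P(N) to the tiers named in the text."""
    if not 0.0 <= p_n <= 1.0:
        raise ValueError("P(N) is a probability and must lie in [0, 1]")
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.2:
        return "similar"
    return "not similar"


if __name__ == "__main__":
    for p in (0.5, 0.1, 0.005, 1e-30):
        print(p, similarity_tier(p))
```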

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross
reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described below. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 
same primers can be used to amplify the sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the protein
encoded by the gene. The term "purified" in some embodiments denotes that a nucleic acid
or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogenous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residue is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences.
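
The codon degeneracy underlying silent variations can be illustrated with a short Python sketch built on the standard genetic code. The names `CODON_TABLE`, `synonymous_codons`, and `count_silent_encodings` are hypothetical and introduced only for this example; the sketch accepts either the RNA (U) or DNA (T) alphabet.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in UCAG order ('*' marks a stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(amino_acid: str):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_encodings(coding_sequence: str) -> int:
    """Number of nucleic acids (including the input) encoding the same polypeptide."""
    rna = coding_sequence.upper().replace("T", "U")
    if len(rna) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return prod(len(synonymous_codons(CODON_TABLE[c])) for c in codons)


if __name__ == "__main__":
    print(synonymous_codons("A"))               # the four alanine codons
    print(synonymous_codons("M"))               # methionine has a single codon
    print(count_silent_encodings("AUGGCUUGG"))  # Met-Ala-Trp
```

For the Met-Ala-Trp example only the alanine codon can vary, so the count equals the number of alanine codons.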

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. The following
eight groups each contain amino acids that are typically
conservative substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D),
Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y),
Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see,
e.g., Creighton, Proteins (1984)).
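
The eight substitution groups listed above lend themselves to a simple lookup. The following Python sketch is illustrative; the names `CONSERVATIVE_GROUPS`, `is_conservative`, and `differs_only_conservatively` are hypothetical, and the last function assumes two peptides of equal length compared position by position.

```python
# The eight groups from the text, using one-letter amino acid symbols.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(a: str, b: str) -> bool:
    """True if a and b are different residues sharing at least one group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differs_only_conservatively(p1: str, p2: str) -> bool:
    """True if equal-length peptides differ only by conservative substitutions."""
    if len(p1) != len(p2):
        return False
    return all(x == y or is_conservative(x, y) for x, y in zip(p1, p2))


if __name__ == "__main__":
    print(is_conservative("I", "V"))
    print(is_conservative("D", "K"))
    print(differs_only_conservatively("MKTAD", "LRSGE"))
```

Note that methionine appears in both group 5 and group 8, so the lookup checks every group rather than assigning each residue to a single one.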

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghui &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984), Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ACS Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA) which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded and single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
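
The statement above that the depiction of a single strand also defines its complementary strand can be illustrated for DNA with a short Python sketch. The function name `reverse_complement` is hypothetical, and the sketch handles only the four standard DNA bases; modified bases such as inosine are outside its scope.

```python
# Watson-Crick pairing for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(reverse_complement("AAGCTT"))  # a palindromic site equals its own complement
```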

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature, 144:945 (1962);
David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981);
and Nygren, J. Histochem. and Cytochem., 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilize the radiolabeled peptide or antibody may
be used including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form
of the cell or express native genes that are otherwise abnormally expressed, under expressed
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc. N.Y.
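
Two of the numerical rules stated above, the temperature window of about 5-10°C below the Tm and the requirement that a positive signal be at least two times background, can be sketched in Python as follows. The function names are hypothetical, the Tm is taken as a measured or separately estimated input rather than being computed, and the "about" qualifiers in the text are treated as exact for simplicity.

```python
def stringent_temperature_range(tm_celsius: float):
    """Return the (low, high) hybridization window, 5-10 degrees C below the Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal: float, background: float, fold: float = 2.0) -> bool:
    """True if the signal is at least `fold` times background (2x by default)."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background >= fold


if __name__ == "__main__":
    print(stringent_temperature_range(70.0))
    print(is_positive_signal(250.0, 100.0))
    print(is_positive_signal(250.0, 100.0, fold=10.0))  # the preferred 10x criterion
```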

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of a 
prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects. 
Such functional effects can be measured by any means known to those skilled in the art, e.g., 
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index), 
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the prostate cancer protein; 
measuring binding activity or binding assays, e.g., binding to antibodies or other ligands, and 
measuring cellular proliferation. Determination of the functional effect of a compound on 
prostate cancer can also be performed using prostate cancer assays known to those of skill in 
the art such as in vitro assays, e.g., cell growth on soft agar; anchorage dependence; 
contact inhibition and density limitation of growth; cellular proliferation; cellular 
transformation; growth factor or serum dependence; tumor specific marker levels; 
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein 
expression in cells undergoing metastasis, and other characteristics of prostate cancer cells. 
The functional effects can be evaluated by many means known to those skilled in the art, e.g., 
microscopy for quantitative or qualitative measures of alterations in morphological features, 
measurement of changes in RNA or protein levels for prostate cancer-associated sequences, 
measurement of RNA stability, identification of downstream or reporter gene expression 
(CAT, luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, 
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide 
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules 
or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide 
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally 
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate 
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic 
acids may seem to inhibit expression and subsequent function of the protein. "Activators" 
are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize, 
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also 
include genetically modified versions of prostate cancer proteins, e.g., versions with altered 
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g., 
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative 
modulator compounds, and then determining the functional effects on activity, as described 
above. Activators and inhibitors of prostate cancer can also be identified by incubating 
prostate cancer cells with the test compound and determining increases or decreases in the 
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences 
set out in Tables 1-16. 

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples 
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition 
of a polypeptide is achieved when the activity value relative to the control is about 80%, 
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is 
achieved when the activity value relative to the control (untreated with activators) is 110%, 
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the 
control), more preferably 1000-3000% higher. 

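The comparison against a control assigned 100% activity can be sketched as follows. This is an illustrative reading only: the function name `classify_activity`, the label "no effect", and the interpretation of "about 80%" as "at or below 80%" and of "110%" as "at or above 110%" are assumptions, not definitions from the text.

```python
def classify_activity(treated: float, control: float,
                      inhibition_cutoff: float = 80.0,
                      activation_cutoff: float = 110.0) -> str:
    """Classify a treated sample by its activity relative to the control (= 100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    relative = 100.0 * treated / control  # percent of control activity
    # Assumption: "about 80%" is read as "80% of control or lower".
    if relative <= inhibition_cutoff:
        return "inhibition"
    # Assumption: "110%" is read as "110% of control or higher".
    if relative >= activation_cutoff:
        return "activation"
    return "no effect"
```

The cutoffs are parameters so that the more preferred values stated above (e.g., 50% for inhibition or 150% for activation) can be substituted.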

The phrase "changes in cell growth" refers to any change in cell growth and 
proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage 
independence, semi-solid or soft agar growth, changes in contact inhibition and density 
limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells: A Manual of Basic 
Technique pp. 231-241 (3rd ed. 1994). 

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor. 

"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers 
to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of 
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney, 
Culture of Animal Cells: A Manual of Basic Technique (3rd ed. 1994)). 

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul, 
Fundamental Immunology. 

An exemplary immunoglobulin (antibody) structural unit comprises a 
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair 
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus 
of each chain defines a variable region of about 100 to 110 or more amino acids primarily 
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy 
chain (VH) refer to these light and heavy chains respectively. 

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized 
fragments produced by digestion with various peptidases. Thus, e.g., pepsin 
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a 
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 
may be reduced under mild conditions to break the disulfide linkage in the hinge region, 
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is 
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 
1993)). While various antibody fragments are defined in terms of the digestion of an intact 
antibody, one of skill will appreciate that such fragments may be synthesized de novo either 
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used 
herein, also includes antibody fragments either produced by the modification of whole 
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single 
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 
348:552-554 (1990)). 

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein, 
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp. 
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in 
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding, 
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the 
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce 
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such 
as other mammals, may be used to express humanized antibodies. Alternatively, phage 
display technology can be used to identify antibodies and heteromeric Fab fragments that 
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554 
(1990); Marks et al., Biotechnology 10:779-783 (1992)). 

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function 
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 



Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the 
sample; while two states may have any particular gene similarly expressed, the evaluation of 
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue 
samples of prostate and other tissues from surviving cancer patients. By comparing 
expression profiles of tissue in known different prostate cancer states, information regarding 
which genes are important (including both up- and down-regulation of genes) in each of these 
states is obtained. 

The identification of sequences that are differentially expressed in prostate 
cancer versus non-prostate cancer tissue allows the use of this information in a number of 
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic 
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a 
particular patient? Similarly, diagnoses may be made, and treatment outcomes confirmed, by 
comparing patient samples with the known expression profiles. Metastatic tissue can also be 
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene 
expression profiles (or individual genes) allow screening of drug candidates with an eye to 
mimicking or altering a particular expression profile; e.g., screening can be done for drugs 
that suppress the prostate cancer expression profile. This may be done by making biochips 
comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen 
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 

Thus the present invention provides nucleic acid and protein sequences that 
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As 
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed 
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed 
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans; 
however, as will be appreciated by those in the art, prostate cancer sequences from other 
organisms may be useful in animal models of disease and drug evaluation; thus, other 
prostate cancer sequences are provided, from vertebrates, including mammals, including 
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, 
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences 
from other organisms may be obtained using the techniques outlined below. 

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as 
screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates 
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is 
generally determined as outlined below, using either homology programs or hybridization 
conditions. 

For identifying prostate cancer-associated sequences, the prostate cancer 
screen typically includes comparing genes identified in different tissues, e.g., normal and 
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs. 
non-metastatic tissue. Other suitable tissue comparisons include comparing prostate cancer 
samples with metastatic cancer samples from other cancers, such as lung, breast, 
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g., 
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips 
comprising nucleic acid probes. The samples are first microdissected, if applicable, and 
treated as is known in the art for the preparation of mRNA. Suitable biochips are 
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein 
are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, and not limited to, lung, heart, brain, liver, breast, kidney, 
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred 
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
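The filtering step described above, in which candidate genes that are expressed in any significant amount in other tissues are removed from the profile, might be sketched like this. The function name, the data layout, and the numeric threshold standing in for "significant amount" are all assumptions for illustration; the text does not specify them.

```python
from typing import Dict, List


def disease_specific(candidates: List[str],
                     normal_expression: Dict[str, Dict[str, float]],
                     threshold: float) -> List[str]:
    """Keep candidates whose expression stays below threshold in every other tissue.

    normal_expression maps gene -> {tissue name -> expression level}.
    A gene with no recorded measurements is kept, since nothing contradicts it.
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept
```

For example, with hypothetical genes `geneA` (low in lung and liver) and `geneB` (high in lung), only `geneA` would remain in the profile.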

In a preferred embodiment, prostate cancer sequences are those that are up-regulated 
in prostate cancer; that is, the expression of these genes is higher in the prostate 
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often 
means at least about a two-fold change, preferably at least about a three fold change, with at 
least about five-fold or higher being preferred. All unigene cluster identification numbers 
and accession numbers herein are for the GenBank sequence database and the sequences of 
the accession numbers are hereby expressly incorporated by reference. GenBank is known in 
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and 
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g., 
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ). 

In another preferred embodiment, prostate cancer sequences are those that are 
down-regulated in prostate cancer; that is, the expression of these genes is lower in prostate 
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14). 
"Down-regulation" as used herein often means at least about a 1.5-fold change, more preferably a 
two-fold change, preferably at least about a three fold change, with at least about five-fold or 
higher being most preferred. 
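The fold-change definitions of up- and down-regulation can be expressed as a short classifier. This is a sketch under stated assumptions: the function name `regulation` and the label "unchanged" are hypothetical, and the defaults use the least stringent cutoffs given in the text (about two-fold for up-regulation, about 1.5-fold for down-regulation).

```python
def regulation(cancer: float, normal: float,
               up_cutoff: float = 2.0, down_cutoff: float = 1.5) -> str:
    """Classify a gene by expression in cancer tissue versus non-cancerous tissue."""
    if cancer <= 0 or normal <= 0:
        raise ValueError("expression levels must be positive")
    # Up-regulation: expression in cancer is at least up_cutoff times normal.
    if cancer / normal >= up_cutoff:
        return "up-regulated"
    # Down-regulation: expression in normal is at least down_cutoff times cancer.
    if normal / cancer >= down_cutoff:
        return "down-regulated"
    return "unchanged"
```

Passing `up_cutoff=5.0` or `down_cutoff=5.0` would apply the more preferred five-fold criterion instead.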


Informatics 

The ability to identify genes that are over or under expressed in prostate 
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used 
in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein 
structure, biosensor development, and other related areas. For example, the expression 
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer. 
Or as another example, subcellular toxicological information can be generated to better direct 
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets, 
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA 
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological 
sensor device to predict the likely toxicological effect of chemical exposures and likely 
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue 
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids, 
saccharides, lipids, drugs, and the like). 

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially 
any form in which data can be maintained and transmitted, but is preferably an electronic 
database. The electronic database of the invention can be maintained on any electronic 
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar 
databases can be assembled for any assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative 
and/or absolute abundance of a variety of molecular and macromolecular species from a 
biological sample undergoing prostate cancer, i.e., the identification of prostate 
cancer-associated sequences described herein, provide an abundance of information, which can be 
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic 
monitoring, gene-disease causal linkages, identification of correlates of immunity and 
physiological status, among others. Although the data generated from the assays of the 
invention is suited for manual review and analysis, in a preferred embodiment, prior data 
processing using high-speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is 
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational 
database system for storing biomolecular sequence information in a manner that allows 
sequences to be catalogued and searched according to one or more protein function 
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records 
containing information in a format that allows a collection of partial-length DNA sequences 
to be catalogued and searched according to association with one or more sequencing projects 
for obtaining full-length sequences from the collection of partial length sequences. U.S. 
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene 
sequence similar to a sequence data item in a gene database based on the degree of similarity 
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method 
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences 
in computer databases by comparison of predicted mass spectra with experimentally-derived 
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a 
multi-dimensional database comprising a functionality for multi-dimensional data analysis 
described as on-line analytical processing (OLAP), which entails the consolidation of 
projected and actual data according to more than one consolidation path or dimension. U.S. 
Patent 5,295,261 reports a hybrid database structure in which the fields of each database 
record are divided into two classes, navigational and informational data, with navigational 
fields stored in a hierarchical topological map which can be viewed as a tree structure or as 
the merger of two or more such tree structures. 

See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis: 
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999); 
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis & 
Ouellette eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological 
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et 
al., eds., 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds., 2000); 
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins & 
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the 
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and 
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995). 

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing 
sample is from a control tissue sample known to be free of pathological disorders. In a 
variation, at least one of the sources is a known pathological tissue specimen, e.g., a 
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another 
variation, the assay records cross-tabulate one or more of the following parameters for each 
target species in a sample: (1) a unique identification code, which can include, e.g., a target 
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
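One way to picture such cross-tabulated assay records is as a table keyed by target identifier and sample source. The following is a minimal sketch; the class `AssayRecord`, its field names, and the function `cross_tabulate` are hypothetical and only mirror the three parameters listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class AssayRecord:
    target_id: str       # (1) unique identification code for the target species
    sample_source: str   # (2) source of the target-containing sample
    quantity: float      # (3) absolute or relative quantity in the sample


def cross_tabulate(records: Iterable[AssayRecord]) -> Dict[str, Dict[str, float]]:
    """Return {target_id: {sample_source: quantity}} for the given records."""
    table: Dict[str, Dict[str, float]] = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return dict(table)
```

Looking up `table[target_id]` then gives the quantity of one target across a control sample and a pathological specimen side by side.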

The invention also provides for the storage and retrieval of a collection of 
target data in a computer data storage apparatus, which can include magnetic disks, optical 
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, 
magnetic bubble memory devices, and other data storage devices, including CPU registers 
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern 
in an array of magnetic domains on a magnetizable medium or as an array of charge states or 
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of 
a transistor and a charge storage area, which may be on the transistor). In one embodiment, 
the invention provides such storage devices, and computer systems built therewith, 
comprising a bit pattern encoding a protein expression fingerprint record comprising unique 
identifiers for at least 10 target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides 
a method for identifying related peptide or nucleic acid sequences, comprising performing a 
computerized comparison between a peptide or nucleic acid sequence assay record stored in 
or retrieved from a computer storage device or database and at least one other sequence. The 
comparison can include a sequence analysis or comparison algorithm or computer program 
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may 
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences 
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an IBM-compatible 
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format 
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or 
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of 
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, 
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium, 
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem, 
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 
comparing a query target to a database containing an array of data structures, such as an assay 
result obtained by the method of the invention, and ranking database targets based on the 
degree of identity and gap weight to the target data. A central processor is preferably 
initialized to load and execute the computer program for alignment and/or comparison of the 
assay results. Data for a query target is entered into the central processor via an I/O device. 
Execution of the computer program results in the central processor retrieving the assay data 
from the data file, which comprises a binary description of an assay result. 

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM, 
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence 
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the 
same characteristic of the query target and results are output via an I/O device. For example, 
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, 
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or 
public domain molecular biology software package (e.g., UWGCG Sequence Analysis 
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory 
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, 
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, 
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 

The invention also preferably provides the use of a computer system, such as 
that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a 
collection of peptide sequence specificity records obtained by the methods of the invention, 
which may be stored in the computer; (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values. 
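The rank-ordering of database targets by computed similarity to a query can be illustrated in a few lines. This sketch deliberately substitutes the Python standard library's `difflib.SequenceMatcher` ratio for a real alignment program; it is not FASTA, GAP, or BESTFIT and applies no gap weights. The function names and record identifiers are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(query: str, target: str) -> float:
    """Similarity in [0, 1]; a simple stand-in for a real alignment score."""
    return SequenceMatcher(None, query, target, autojunk=False).ratio()


def rank_targets(query: str, database: Dict[str, str]) -> List[Tuple[str, float]]:
    """Rank database entries by descending similarity to the query.

    Ties are broken by record name so the ordering is deterministic.
    """
    scored = [(name, similarity(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

A production system would replace `similarity` with the score reported by the chosen alignment program while keeping the same ranking step.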


Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular 
function and replication (including, e.g., signaling pathways); aberrant expression of such 
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Molecular 
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins 
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease 
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins 
also serve as docking proteins that are involved in organizing complexes of proteins, or 
targeting proteins to various subcellular localizations, and are involved in maintaining the 
structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence 
in the proteins of one or more motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein 
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated 
targets in a sequence dependent manner. PTB domains, which are distinct from SH2 
domains, also bind tyrosine-phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by 
one of ordinary skill in the art, these motifs can be identified on the basis of primary 
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and the molecules with which the protein may associate.
One useful database is Pfam (protein families), which is a large collection of multiple 
sequence alignments and hidden Markov models covering many common protein domains. 
Versions are available via the internet from Washington University in St. Louis, the Sanger 
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc.
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322
(1998)).
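
As a minimal sketch of identifying a motif from primary sequence alone, the following scans a protein sequence with regular expressions. It is a deliberately simplified stand-in: databases such as Pfam rely on multiple sequence alignments and hidden Markov models rather than fixed patterns. The motif table is hypothetical and contains only the PxxP core commonly associated with proline-rich SH3 targets:

```python
import re

# Hypothetical, simplified motif table for illustration only.
MOTIFS = {
    "PxxP (proline-rich core)": r"P..P",
}


def find_motifs(sequence: str, motifs: dict = MOTIFS) -> list:
    """Return (motif name, 0-based start, matched text) for every occurrence."""
    hits = []
    for name, pattern in motifs.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({pattern}))", sequence.upper()):
            hits.append((name, match.start(), match.group(1)))
    return hits


hits = find_motifs("AAPPLPAA")
```

A pattern hit of this kind only flags a candidate region; it does not by itself establish that the region is a functional binding site.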

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. 
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular 
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2
domain-containing proteins.

Transmembrane proteins may contain from one to many transmembrane 
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor 
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors 
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane 
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of 
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed by charged amino acids. Therefore, upon analysis of the amino acid
sequence of a particular protein, the localization and number of transmembrane domains
within the protein may be predicted (see, e.g., PSORT web site http://psort.nibb.ac.jp/).
Important transmembrane protein receptors include, but are not limited to, the insulin
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose
transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein
receptor, leptin receptor, interleukin receptors, e.g., IL-1
receptor, IL-2 receptor, etc.
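
The prediction of transmembrane domains from runs of approximately 20 hydrophobic residues can be sketched with a sliding-window hydropathy scan. This is not the PSORT method; it is a minimal illustration that assumes the Kyte-Doolittle hydropathy scale, a 20-residue window and a cutoff of 1.6, none of which is specified in the text above:

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(seq: str, window: int = 20) -> list:
    """Mean hydropathy of every full-length window; unknown residues score 0."""
    seq = seq.upper()
    return [
        sum(KYTE_DOOLITTLE.get(aa, 0.0) for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_segments(seq: str, window: int = 20, threshold: float = 1.6) -> list:
    """Merge overlapping high-scoring windows into (start, end) candidate segments."""
    segments = []
    for start, score in enumerate(window_scores(seq, window)):
        if score <= threshold:
            continue
        end = start + window
        if segments and start <= segments[-1][1]:
            # Window overlaps the previous candidate: extend it.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Hypothetical sequence: a 20-residue leucine run flanked by charged residues.
example = "DE" * 10 + "L" * 20 + "KR" * 10
segments = candidate_tm_segments(example)
```

The number of segments returned approximates the number of candidate membrane-spanning regions; the segment boundaries are coarse because each window is averaged over its full length.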

The extracellular domains of transmembrane proteins are diverse; however, 
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating 
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like. 
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI)
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate 
with the extracellular matrix and contribute to the maintenance of the cell structure. 

Prostate cancer proteins that are transmembrane are particularly preferred in
the present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in 
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can 
be made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted 
proteins are involved in numerous physiological events; by virtue of their circulating nature, 
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting 
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous 
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly 
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood, plasma, serum, or stool tests.

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid
or amino acid sequence, and is generally determined as outlined below, using either
homology programs or hybridization conditions. Typically, linked sequences on an mRNA are
found on the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid
segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art, 
using the sequences provided herein, extended sequences, in either direction, of the prostate 
cancer genes can be obtained, using techniques well known in the art for cloning either longer 
sequences or the full length sequences; see Ausubel et al., supra. Much can be done by
informatics, and many sequences can be clustered to include multiple sequences
corresponding to a single gene, e.g., using systems such as UniGene (see
http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if 
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g.,
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can 
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic
acids and proteins.

The prostate cancer nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are 
made and attached to biochips to be used in screening and diagnostic methods, as outlined 
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications.
Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer
proteins can be put into expression vectors for the expression of prostate cancer proteins, 
again for screening purposes or for administration to a patient. 

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic 
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof)
are made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization 
conditions, the sequence is not a complementary target sequence. Thus, by "substantially 
complementary" herein is meant that the probes are sufficiently complementary to the target 
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
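
The notion of imperfect complementarity can be made concrete with a minimal sketch that counts base-pair mismatches between a probe and an equal-length target site. The mismatch cutoff is a hypothetical parameter introduced for illustration only; actual hybridization depends on stringency conditions, which a simple count does not model:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions at which the probe fails to pair with the target site."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


def is_substantially_complementary(probe: str, target_site: str,
                                   max_mismatches: int = 2) -> bool:
    """Hypothetical cutoff: accept probes with at most max_mismatches mismatches."""
    return count_mismatches(probe, target_site) <= max_mismatches


# Hypothetical target site and a probe carrying one mismatch.
mismatches = count_mismatches("GTACGCAA", "ATGCGTAC")
```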

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range 
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with 
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
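
The use of several probes per target can be sketched as follows. The function below is a hypothetical helper, not a method described herein: it spaces a chosen number of probes evenly along a target and returns each as the reverse complement of its target window, so that probes overlap whenever the target is shorter than the combined probe length:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target: str, probe_len: int = 40, n_probes: int = 3) -> list:
    """Return (start, probe) pairs evenly spaced across the target sequence."""
    if n_probes < 1:
        raise ValueError("at least one probe is required")
    if probe_len > len(target):
        raise ValueError("probe length exceeds target length")
    span = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [round(i * span / (n_probes - 1)) for i in range(n_probes)]
    # Each probe is complementary to the target window it covers.
    return [(s, reverse_complement(target[s:s + probe_len])) for s in starts]


# Hypothetical 80-base target covered by three overlapping 40-base probes.
probes = tile_probes("ACGT" * 20, probe_len=40, n_probes=3)
```

The defaults follow the preferences stated above (three probes, lengths within the 30 to 50 base range); real probe selection would also weigh sequence uniqueness and melting temperature.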

As will be appreciated by those in the art, nucleic acids can be attached or 
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and 
removal as outlined below. The binding can typically be covalent or non-covalent. By "non- 
covalent binding" and grammatical equivalents herein is meant one or more of electrostatic, 
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds. 
Covalent bonds can be formed directly between the probe and the solid support or can be 
formed by a cross linker or by inclusion of a specific reactive group on either the solid 
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions. 

In general, the probes are attached to the biochip in a wide variety of ways, as 
will be appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid 
support" or other grammatical equivalents herein is meant a material that can be modified to 
contain discrete individual sites appropriate for the attachment or association of the nucleic 
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large; possible substrates include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable Low
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999, 
herein incorporated by reference in its entirety. 

Generally the substrate is planar, although as will be appreciated by those in
the art, other configurations of substrates may be used as well. For example, the probes may
be placed on the inside surface of a tube, for flow-through sample analysis to minimize 
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including 
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases, 
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be 
used. 

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may 
be via an internal nucleoside. 

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is 
known in the art. For example, photoactivation techniques utilizing photopolymerization 
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be 
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression 
level of prostate cancer-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain 
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will 
be proportional to the amount of template in the original sample. Comparison to appropriate 
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990). 
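
The proportionality between amplification product and starting template can be illustrated with a minimal sketch of idealized exponential amplification. The per-cycle efficiency parameter and the threshold-cycle comparison below are assumptions introduced for illustration; they are a common simplification and are not a protocol taken from the cited reference:

```python
def amplified_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized product amount after a number of cycles.

    efficiency = 1.0 means the product doubles every cycle.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def relative_template(ct_sample: float, ct_control: float, efficiency: float = 1.0) -> float:
    """Estimate sample template relative to a control from threshold cycles.

    A sample that reaches the detection threshold earlier (lower Ct) started
    with proportionally more template.
    """
    return (1.0 + efficiency) ** (ct_control - ct_sample)


# Hypothetical values: the sample crosses the threshold three cycles before the control.
fold_difference = relative_template(ct_sample=20, ct_control=23)
```

Real reactions amplify with less than perfect efficiency and eventually plateau, which is why comparison to appropriate controls is needed.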

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the 
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3' 
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase 
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR,
etc.

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding
prostate cancer proteins, are used to make a variety of expression vectors to express prostate
cancer proteins which can then be used in screening assays, as described below. Expression 
vectors and recombinant DNA technology are well known to those of skill in the art (see, 
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and 
are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably 
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences" 
refers to DNA sequences used for the expression of an operably linked coding sequence in a 
particular host organism. Control sequences that are suitable for prokaryotes include, e.g., a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or 
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein 
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked 
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation. 
Generally, "operably linked" means that the DNA sequences being linked are contiguous, 
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers 
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory 
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer 
protein. Numerous types of appropriate expression vectors, and suitable regulatory 
sequences are known in the art for a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a
prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell 
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra). 

In addition, in a preferred embodiment, the expression vector contains a 
selectable marker gene to allow the selection of transformed host cells. Selection genes are 
well known in the art and will vary with the host cell used.

The prostate cancer proteins of the present invention are produced by culturing 
a host cell transformed with an expression vector containing nucleic acid encoding a prostate 
cancer protein, under the appropriate conditions to induce or cause expression of the prostate 
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with
the choice of the expression vector and the host cell, and will be easily ascertained by one
skilled in the art through routine experimentation or optimization. For example, the use of 
constitutive promoters in the expression vector will require optimizing the growth and 
proliferation of the host cell, while the use of an inducible promoter requires the appropriate 
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect 
and animal cells, including mammalian cells. Of particular interest are Saccharomyces 
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells,
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the prostate cancer proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include 
retroviral and adenoviral systems. One expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei. 

In a preferred embodiment, prostate cancer proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters 
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac 
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome 
binding site is desirable. The expression vector may also include a signal peptide sequence 
that provides for secretion of the prostate cancer protein in bacteria. The protein is either 
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of 
bacterial strains that have been transformed. Suitable selection genes include genes which 
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, 
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris, 
and Streptococcus lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial 
expression vectors are transformed into bacterial host cells using techniques well known in
the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based 
expression vectors, are well known in the art. 

In a preferred embodiment, prostate cancer protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica. 

The prostate cancer protein may also be made as a fusion protein, using
techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form 
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to 
increase expression, or for other reasons. For example, when the prostate cancer protein is a 
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic
acid for expression purposes.

In a preferred embodiment, the prostate cancer protein is purified or isolated 
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways 
known to those skilled in the art depending on what other components are present in the 
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer
protein may be purified using a standard anti-prostate cancer protein antibody column. 
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also 
useful. For general guidance in suitable purification techniques, see Scopes, Protein
Purification (1982). The degree of purification necessary will vary depending on the use of
the prostate cancer protein. In some instances no purification will be necessary. 

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins

In one embodiment, the prostate cancer proteins are derivative or variant 
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more 
fully below, the derivative prostate cancer peptide will often contain at least one amino acid 
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the 
prostate cancer peptide. 

Also included within one embodiment of prostate cancer proteins of the 
present invention are amino acid sequence variants. These variants typically fall into one or
more of three classes: substitutional, insertional or deletional variants. These variants
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in 
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in 
recombinant cell culture as outlined above. However, variant prostate cancer protein
fragments having up to about 100-150 residues may be prepared by in vitro synthesis using
established techniques. Amino acid sequence variants are characterized by the predetermined 
nature of the variation, a feature that sets them apart from naturally occurring allelic or 
interspecies variation of the prostate cancer protein amino acid sequence. The variants 
typically exhibit the same qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below. 

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed prostate cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, e.g., 
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using 
assays of prostate cancer protein activities. 

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger. 

Substitutions, deletions, insertions or any combination thereof may be used to 
arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the prostate cancer protein are 
desired, substitutions are generally made in accordance with the amino acid substitution 
relationships provided in the definition section. 

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively, 
the variant may be designed such that the biological activity of the prostate cancer protein is 
altered. For example, glycosylation sites may be altered or removed. 

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those described above. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide 
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. 
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
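
By way of illustration only, the four classes (a) through (d) above can be sketched as a small lookup. The residue sets below contain only the examples named in the text, written in standard one-letter codes; the function and variable names are hypothetical, and a fuller classification would need larger residue groups than the text enumerates.

```python
# Residue groups limited to the examples given in the text (one-letter codes).
HYDROPHILIC = {"S", "T"}                  # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}   # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}         # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}              # glutamyl, aspartyl
BULKY = {"F"}                             # phenylalanine (the only example given)
NO_SIDE_CHAIN = {"G"}                     # glycine


def _swapped(a, b, group1, group2):
    """True if one residue is in group1 and the other in group2 ("for (or by)")."""
    return (a in group1 and b in group2) or (a in group2 and b in group1)


def disruptive_classes(original, replacement):
    """Return the classes (a)-(d) that a single-residue substitution falls into."""
    if original == replacement:
        return []
    hits = []
    if _swapped(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        hits.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        hits.append("b")
    if _swapped(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        hits.append("c")
    if _swapped(original, replacement, BULKY, NO_SIDE_CHAIN):
        hits.append("d")
    return hits
```

An empty result means only that the substitution matches none of the listed examples; it does not by itself establish that the substitution is conservative.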

Covalent modifications of prostate cancer polypeptides are included within the
scope of this invention. One type of covalent modification includes reacting targeted amino
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the
N-terminal amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the 
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many 
ways. For example, the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids. 
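
As a minimal sketch of such a change at the DNA level, the following replaces one codon so that it translates into serine, a candidate O-linked site. The nine-base sequence, the helper names, and the deliberately partial codon table are all hypothetical and chosen only for illustration; they are not sequences of the invention.

```python
# Partial codon table: only the codons used in this illustration (standard genetic code).
CODON_TO_AA = {
    "ATG": "M",               # methionine
    "GCT": "A", "GCC": "A",   # alanine
    "TCT": "S", "AGC": "S",   # serine
    "ACC": "T",               # threonine
    "GGC": "G",               # glycine
}


def translate(dna):
    """Translate an in-frame DNA string using the partial table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def replace_codon(dna, residue_index, new_codon):
    """Return dna with the codon of the 0-based residue_index replaced by new_codon."""
    start = residue_index * 3
    return dna[:start] + new_codon + dna[start + 3:]


wild_type = "ATGGCTGGC"                        # hypothetical coding sequence: Met-Ala-Gly
variant = replace_codon(wild_type, 1, "TCT")   # GCT -> TCT changes Ala to Ser
print(translate(wild_type), translate(variant))
```

In this example a single base change (G to T in the first position of the codon) is enough to generate the serine codon.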

Another means of increasing the number of carbohydrate moieties on the 
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons 
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys. 259:52 (1987) and by Edge et al., Anal. Biochem. 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol. 138:350 (1987).

Another type of covalent modification of the prostate cancer polypeptide comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

Prostate cancer polypeptides of the present invention may also be modified in
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., BioTechnology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer 
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is 
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides 
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed. 
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
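
The primer length ranges stated above can be sketched as a simple check. The text qualifies both ranges with "about", so the hard cutoffs below are an assumption made for illustration; the function name and the use of "I" to denote inosine are likewise hypothetical conventions.

```python
ALLOWED_BASES = set("ACGTI")  # "I" stands for inosine, which the text permits as needed


def primer_length_class(primer):
    """Classify a primer by the length ranges stated in the text (treated as hard cutoffs)."""
    primer = primer.upper()
    if not primer or set(primer) - ALLOWED_BASES:
        raise ValueError("primer must be a non-empty string of A, C, G, T or I")
    n = len(primer)
    if 20 <= n <= 30:
        return "preferred"      # about 20 to about 30 nucleotides
    if 15 <= n <= 35:
        return "acceptable"     # about 15 to about 35 nucleotides
    return "outside range"
```

Length is only one criterion; uniqueness of the primed region, which the text also calls for, is not assessed here.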

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to 
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not 
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean 
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-16, a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture
medium that preferably contains one or more substances that inhibit the growth or survival of
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof; the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.
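
For illustration, the percent decrease and the preference tiers above can be computed as follows. The function names are hypothetical, the measured quantity (activity, growth, size or the like) is left abstract, and the "about" qualifiers in the text are treated as exact cutoffs purely for the sake of the sketch.

```python
def percent_decrease(before, after):
    """Percent decrease of a measured quantity after antibody addition."""
    if before <= 0:
        raise ValueError("baseline measurement must be positive")
    return 100.0 * (before - after) / before


def decrease_tier(pct):
    """Map a percent decrease onto the tiers named in the text (cutoffs assumed exact)."""
    if pct >= 95:
        return "especially preferred"     # about a 95-100% decrease
    if pct >= 50:
        return "particularly preferred"   # at least about 50%
    if pct >= 25:
        return "preferred"                # at least a 25% decrease
    return "below preferred range"
```

For example, a measurement that falls from 200 units to 100 units is a 50% decrease and lands in the "particularly preferred" tier.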

In a preferred embodiment the antibodies to the prostate cancer proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the 
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all 
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is 
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology
10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody 
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the 
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary 
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response. 

In a preferred embodiment the prostate cancer proteins against which 
antibodies are raised are secreted proteins as described above. Without being bound by
theory, antibodies used for treatment bind the secreted protein and prevent it from binding to
its receptor, thereby inactivating the secreted prostate cancer protein.

In another preferred embodiment, the prostate cancer protein to which
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, prostate cancer is
treated by administering to a patient antibodies directed against the transmembrane prostate
cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector 
moiety. The effector moiety can be any number of molecules, including labelling moieties 
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect 
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer. 

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic 
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited 
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their 
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A 
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently 
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer 
proteins not only serves to increase the local concentration of therapeutic moiety in the 
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated 
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by 
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the prostate cancer protein can be targeted within a
cell, e.g., the nucleus, an antibody thereto contains a signal for that target localization, e.g., a
nuclear localization signal.

The prostate cancer antibodies of the invention specifically bind to prostate 
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.
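
The binding tiers above can be sketched as a lookup on the dissociation constant, where a lower Kd means tighter binding and "at least" or "or better" is read as "this Kd or lower". The sketch assumes the successive cutoffs are 0.1 mM, 1 µM, 0.1 µM and 0.01 µM; the third value is a reading of the text adopted so that the tiers decrease monotonically, and the function name is hypothetical.

```python
def kd_tier(kd_molar):
    """Classify a dissociation constant (in mol/L) against the tiers in the text.

    Assumed cutoffs: 1e-4 M (0.1 mM), 1e-6 M (1 uM), 1e-7 M (0.1 uM), 1e-8 M (0.01 uM).
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    if kd_molar <= 1e-8:
        return "most preferred"
    if kd_molar <= 1e-7:
        return "preferred"
    if kd_molar <= 1e-6:
        return "more usual"
    if kd_molar <= 1e-4:
        return "specific binding"
    return "not specific by this definition"
```

An antibody with a Kd of 5 nM (5e-9 M), for instance, would fall in the "most preferred" tier, while one with a Kd of 1 mM would not count as specifically binding under this definition. Selectivity, which the text also requires, is not captured by Kd alone.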

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated 
to provide expression profiles. An expression profile of a particular cell state or point of 
development is essentially a "fingerprint" of the state. While two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes 
are important (including both up- and down-regulation of genes) in each of these states is 
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions. 
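
One simple way to compare a sample's expression profile against reference profiles of known states is sketched below. The text does not prescribe a comparison method; Pearson correlation is used here only as an assumed stand-in, and the four-gene profiles, state labels and function names are hypothetical illustration values, not data from the tables.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors (non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def closest_state(sample, references):
    """Return the label of the reference profile most correlated with the sample."""
    return max(references, key=lambda state: pearson(sample, references[state]))


# Hypothetical expression levels for the same four genes in each state.
references = {
    "normal": [1.0, 1.2, 0.9, 1.1],
    "prostate cancer": [4.0, 0.3, 2.5, 1.0],
}
sample = [3.6, 0.4, 2.2, 1.1]
print(closest_state(sample, references))
```

The more genes the profiles share, the more informative such a comparison becomes, consistent with the point above that a profile evaluates a number of genes simultaneously.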

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression 
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular 
state, relative to another state thus permitting comparison of two or more states. A 
qualitatively regulated gene will exhibit an expression pattern within a state or cell type 
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting 
in an increased amount of transcript, or downregulated, resulting in a decreased amount of 
transcript. The degree to which expression differs need only be large enough to quantify via 
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined 
above, preferably the change in expression (i.e., upregulation or downregulation) is at least 
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
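
The thresholds above can be illustrated with a short calculation. The text does not state the baseline against which a percent change is measured; the sketch below assumes, purely for illustration, that the change is expressed relative to the lower of the two expression levels, so that upregulation and downregulation of the same magnitude give the same figure. The function names are hypothetical.

```python
THRESHOLDS = [300, 200, 150, 100, 50]  # percent cutoffs named in the text, highest first


def percent_change(level_a, level_b):
    """Percent difference between two expression levels, relative to the lower one.

    Measuring against the lower level is an assumption made for this sketch.
    """
    low, high = sorted((level_a, level_b))
    if low <= 0:
        raise ValueError("expression levels must be positive")
    return 100.0 * (high - low) / low


def highest_threshold_met(pct):
    """Return the largest listed threshold that pct reaches, or None if below 50%."""
    for cutoff in THRESHOLDS:
        if pct >= cutoff:
            return cutoff
    return None
```

Under this assumption, a transcript measured at 10 units in normal tissue and 30 units in prostate cancer tissue shows a 200% change, as does the reverse case of 30 units falling to 10.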

Evaluation may be at the gene transcript, or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent 
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the 
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to 
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype, 
can be evaluated in a prostate cancer diagnostic test. 

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the prostate cancer protein 
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected, 
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and
5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins 
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins. 

15 A preferred method separates proteins from a sample by electrophoresis on a gel (typically a 
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

20 In another preferred method, antibodies to the prostate cancer protein find use 

in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in 
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one 
to many antibodies to the prostate cancer protein(s). Following washing to remove non- 
specific antibody binding, the presence of the antibody or antibodies is detected. In one 

25 embodiment the antibody is detected by incubating with a secondary antibody that contains a 
detectable label. In another method the primary antibody to the prostate cancer protein(s) 
contains a detectable label, e.g. an enzyme marker that can act on a substrate. In another 
preferred embodiment each one of multiple primary antibodies contains a distinct and 
detectable label. This method finds particular use in simultaneous screening for a plurality of 

30 prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many 
other histological imaging techniques are also provided by the invention. 
In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a 
fluorescence activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing prostate 
5 cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are 
useful as samples to be probed or tested for the presence of prostate cancer proteins. 
Antibodies can be used to detect a prostate cancer protein by previously described 
immunoassay techniques including ELISA, immunoblotting (western blotting), 
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of 
10 antibodies may indicate an immune response against an endogenous prostate cancer protein. 

In a preferred embodiment, in situ hybridization of labeled prostate cancer 
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including 
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., 
Ausubel, supra) is then performed. When comparing the fingerprints between an individual 
15 and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on 
the findings. It is further understood that the genes which indicate the diagnosis may differ 
from those which indicate the prognosis and molecular profiling of the condition of the cells 
may lead to distinctions between responsive or refractory conditions or may be predictive of 
outcomes. 

20 In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 

acids, modified proteins and cells containing prostate cancer sequences are used in prognosis 
assays. As above, gene expression profiles can be generated that correlate to prostate cancer, 
in terms of long term prognosis. Again, this may be done on either a protein or gene level, 
with the use of genes being preferred. As above, prostate cancer probes may be attached to 

25 biochips for the detection and quantification of prostate cancer sequences in a tissue or 

patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification. 

Assays for therapeutic compounds 
30 In a preferred embodiment members of the proteins, nucleic acids, and 

antibodies as described herein are used in drug screening assays. The prostate cancer 
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer
sequences are used in drug screening assays or by evaluating the effect of drug candidates on 
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment, 
the expression profiles are used, preferably in conjunction with high throughput screening 
5 techniques to allow monitoring for expression profile genes after treatment with a candidate 
agent (e.g., Zlokarnik, et al., Science 279:84-8 (1998); Heid, Genome Res 6:986-94, 1996).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic 
acids, modified proteins and cells containing the native or modified prostate cancer proteins 
are used in screening assays. That is, the present invention provides novel methods for 

10 screening for compositions which modulate the prostate cancer phenotype or an identified 
physiological function of a prostate cancer protein. As above, this can be done on an 
individual gene level or by evaluating the effect of drug candidates on a "gene expression 
profile". In a preferred embodiment, the expression profiles are used, preferably in 
conjunction with high throughput screening techniques to allow monitoring for expression 

15 profile genes after treatment with a candidate agent, see Zlokarnik, supra. 

Having identified the differentially expressed genes herein, a variety of assays 
may be executed. In a preferred embodiment, assays may be run on an individual gene or 
protein level. That is, having identified a particular gene as up regulated in prostate cancer, 
test compounds can be screened for the ability to modulate gene expression or for binding to 

20 the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in 
gene expression. The preferred amount of modulation will depend on the original change of 
the gene expression in normal versus tissue undergoing prostate cancer, with changes of at 
least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 300- 
1000% or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue 

25 compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold 
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a 
10-fold increase in expression to be induced by the test compound. 
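
The arithmetic of the preceding example can be sketched as follows. This Python example is illustrative only: it assumes expression is summarized as a single ratio of prostate cancer tissue to normal tissue, and the function names restoring_fold_change and percent_modulation are hypothetical.

```python
def restoring_fold_change(cancer_vs_normal: float) -> float:
    """Fold change a test compound would need to induce to return
    expression to the normal level.

    cancer_vs_normal is expression in prostate cancer tissue divided by
    expression in normal tissue (4.0 for a 4-fold increase, 0.1 for a
    10-fold decrease).
    """
    if cancer_vs_normal <= 0:
        raise ValueError("expression ratio must be positive")
    return 1.0 / cancer_vs_normal


def percent_modulation(untreated: float, treated: float) -> float:
    """Signed percent change in expression caused by the test compound."""
    if untreated <= 0:
        raise ValueError("untreated expression must be positive")
    return (treated - untreated) / untreated * 100.0


if __name__ == "__main__":
    # A gene up 4-fold in cancer calls for about a 4-fold decrease.
    print(restoring_fold_change(4.0))
    # Percent change from 100 units untreated to 150 units treated.
    print(percent_modulation(100.0, 150.0))
```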

The amount of gene expression may be monitored using nucleic acid probes 
and the quantification of gene expression levels, or, alternatively, the gene product itself can 

30 be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard 
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene expression or protein monitoring of a number 
of entities, i.e., an expression profile, is monitored simultaneously. Such profiles will 
5 typically involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of prostate cancer sequences 
in a particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates,
may be used with dispensed primers in desired wells. A PCR reaction can then be performed 
10 and analyzed for each well. 

Expression monitoring can be performed to identify compounds that modify 
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide 
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is 
added to the cells prior to analysis. Moreover, screens are also provided to identify agents 
15 that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer 
protein, or interfere with the binding of a prostate cancer protein and an antibody or other 
binding partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical 
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic 

20 molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or 

indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence, 
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter 
expression profiles, or expression profile nucleic acids or proteins provided herein. In one 
embodiment, the modulator suppresses a prostate cancer phenotype, e.g. to a normal tissue 

25 fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype.

Generally, a plurality of assay mixtures are run in parallel with different agent concentrations 
to obtain a differential response to the various concentrations. Typically, one of these 
concentrations serves as a negative control, i.e., at zero concentration or below the level of 
detection. 

30 Drug candidates encompass numerous chemical classes, though typically they 

are organic molecules, preferably small organic compounds having a molecular weight of 
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than
2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents comprise 
functional groups necessary for structural interaction with proteins, particularly hydrogen 
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, 

5 preferably at least two of the functional chemical groups. The candidate agents often 
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic 
structures substituted with one or more of the above functional groups. Candidate agents are 
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, 
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred 

10 are peptides. 

In one aspect, a modulator will neutralize the effect of a prostate cancer 
protein. By "neutralize" is meant that the activity of a protein, and its consequent
effect on the cell, is inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be 
15 screened for an ability to bind to a prostate cancer polypeptide or to modulate activity. 

Conventionally, new chemical entities with useful properties are generated by identifying a 
chemical compound (called a "lead compound") with some desirable property or activity, 
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property 
and activity of those variant compounds. Often, high throughput screening (HTS) methods 
20 are employed for such an analysis. 

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that
25 display a desired characteristic activity. The compounds thus identified can serve as 

conventional "lead compounds" or can themselves be used as potential or actual therapeutics. 

A combinatorial chemical library is a collection of diverse chemical 
compounds generated by either chemical synthesis or biological synthesis by combining a 
number of chemical "building blocks" such as reagents. For example, a linear combinatorial 
30 chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of 
chemical building blocks called amino acids in every possible way for a given compound 
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)).
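
As an illustration only, the size of a linear combinatorial library grows as the number of building blocks raised to the compound length. The following Python sketch assumes the twenty standard amino acids as building blocks; the names AMINO_ACIDS, library_size and enumerate_library are hypothetical and are not taken from the cited references.

```python
from itertools import product

# Assumption: the twenty standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct linear compounds of the given length."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return len(alphabet) ** length


def enumerate_library(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length, one string at a time."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


if __name__ == "__main__":
    # A pentapeptide library already contains millions of members.
    print(library_size(5))
    # Enumeration is lazy, so only the first member is built here.
    print(next(enumerate_library(5)))
```

Enumeration is written as a generator because the full library quickly becomes too large to hold in memory.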

Preparation and screening of combinatorial chemical libraries is well known to 

5 those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, 
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493 
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT 
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as 

10 hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)),

15 oligocarbamates (Cho, et al., Science 261:1303 (1993)), and/or peptidyl phosphonates

(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem.
37:1385 (1994), nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see, 

20 e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993); 
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, 
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like). 

25 Devices for the preparation of combinatorial libraries are commercially 

available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, 
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, 
Bedford, MA). 

A number of well known robotic systems have also been developed for 
30 solution phase chemistries. These systems include automated workstations like the 

automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, 
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual 
synthetic operations performed by a chemist. Any of the above devices are suitable for use 
with the present invention. The nature and implementation of modifications to these devices 
5 (if any) so that they can operate as discussed herein will be apparent to persons skilled in the 
relevant art. In addition, numerous combinatorial libraries are themselves commercially 
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

10 The assays to identify modulators are amenable to high throughput screening. 

Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription, 
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other 

15 properties of particular nucleic acids or protein products are well known to those of skill in 
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins, 
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid 
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high 

20 throughput methods of screening for ligand/antibody binding. 

In addition, high throughput screening systems are commercially available 
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems 
typically automate entire procedures, including all sample and reagent pipetting, liquid 

25 dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate 
for the assay. These configurable systems provide high throughput and rapid start up as well 
as a high degree of flexibility and customization. The manufacturers of such systems provide 
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene 

30 transcription, ligand binding, and the like. 
In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing 
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In 
this way libraries of proteins may be made for screening in the methods of the invention.

5 Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and 
mammalian proteins, with the latter being preferred, and human proteins being especially 
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors. 

In a preferred embodiment, modulators are peptides of from about 5 to about 

10 30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 
to about 15 being particularly preferred. The peptides may be digests of naturally occurring 
proteins as is outlined above, random peptides, or "biased" random peptides. By 
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide 
consists of essentially random nucleotides and amino acids, respectively. Since generally 

15 these random peptides (or nucleic acids, discussed below) are chemically synthesized, they 
may incorporate any nucleotide or amino acid at any position. The synthetic process can be 
designed to generate randomized proteins or nucleic acids, to allow the formation of all or 
most of the possible combinations over the length of the sequence, thus forming a library of 
randomized candidate bioactive proteinaceous agents. 

20 In one embodiment, the library is fully randomized, with no sequence 

preferences or constants at any position. In a preferred embodiment, the library is biased. 
That is, some positions within the sequence are either held constant, or are selected from a 
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or 
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids, 

25 hydrophilic residues, sterically biased (either small or large) residues, towards the creation of 
nucleic acid binding domains, the creation of cysteines, for cross-linking, prolines for SH-3 
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to 
purines, etc. 
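
By way of a hedged illustration, a biased library can be described as a per-position template in which each position is either held constant or drawn from a limited set of residues. The Python sketch below assumes such a template representation; the residue classes HYDROPHOBIC and ANY_RESIDUE, the function biased_peptide and the example template are hypothetical and chosen only for demonstration.

```python
import random

# Assumptions for illustration: one-letter amino acid codes and a simple
# hydrophobic class; neither is a definition taken from the specification.
ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = "AVLIMFW"


def biased_peptide(template, rng: random.Random) -> str:
    """Draw one peptide from a biased library.

    template is a sequence of strings; each string lists the residues
    allowed at that position. A one-character string holds the position
    constant, a longer string restricts it to a defined class.
    """
    return "".join(rng.choice(allowed) for allowed in template)


if __name__ == "__main__":
    # Fixed cysteines at both ends (e.g., for cross-linking), a fixed
    # proline, one hydrophobic position, the rest fully randomized.
    template = ["C", ANY_RESIDUE, ANY_RESIDUE, HYDROPHOBIC, "P", ANY_RESIDUE, "C"]
    rng = random.Random(0)  # seeded so the draw is reproducible
    for _ in range(3):
        print(biased_peptide(template, rng))
```

A fully randomized library is the special case in which every template position is ANY_RESIDUE.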

Modulators of prostate cancer can also be nucleic acids, as defined below. As 
30 described above generally for proteins, nucleic acid modulating agents may be naturally 
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For 
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for
proteins. 

In certain embodiments, the activity of a prostate cancer-associated protein is 
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
5 acid complementary to, and which can preferably hybridize specifically to, a coding mRNA 
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof. 
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability 
of the mRNA. 

In the context of this invention, antisense polynucleotides can comprise 

10 naturally-occurring nucleotides, or synthetic species formed from naturally-occurring 
subunits or their close homologs. Antisense polynucleotides may also have altered sugar 
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other 
sulfur containing species which are known for use in the art. Analogs are comprehended by 
this invention so long as they function effectively to hybridize with the prostate cancer 

15 protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA. 

Such antisense polynucleotides can readily be synthesized using recombinant 
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art. 

20 Antisense molecules as used herein include antisense or sense 

oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by 
binding to the anti-sense strand. The antisense and sense oligonucleotide comprise a single- 
stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA 
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense 

25 oligonucleotides, according to the present invention, comprise a fragment generally at least 
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an 
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein 
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al.
(BioTechniques 6:958 (1988)). 
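
As a minimal sketch of deriving antisense oligonucleotides from a cDNA sequence, each candidate can be taken as the reverse complement of a window of the coding (sense) strand. The Python example below assumes a DNA alphabet of A, C, G and T and the 14 to 30 nucleotide length range stated above; the function name antisense_oligos and the input sequence are hypothetical, and no selection for specificity or accessibility is attempted.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligos(cdna: str, length: int = 20):
    """Yield (start, oligo) pairs for every window of the sense cDNA.

    Each oligo is the reverse complement of cdna[start:start + length],
    written 5' to 3', so that it can base-pair with the corresponding
    stretch of mRNA. Characters other than A, C, G, T are left unchanged.
    """
    if not 14 <= length <= 30:
        raise ValueError("length is expected to be from 14 to 30 nucleotides")
    cdna = cdna.upper()
    for start in range(len(cdna) - length + 1):
        window = cdna[start:start + length]
        yield start, window.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical 16-nucleotide sense sequence, for demonstration only.
    for start, oligo in antisense_oligos("ATGGCCAAGTTTGCAG", length=14):
        print(start, oligo)
```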

30 In addition to antisense polynucleotides, ribozymes can be used to target and 

inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an 
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes
have been described, including group I ribozymes, hammerhead ribozymes, hairpin
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in
Pharmacology 25: 289-317 (1994) for a general review of the properties of different 
5 ribozymes). 

The general features of hairpin ribozymes are described, e.g., in Hampel et al.,
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent
No. 5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g.,
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et

10 al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 92:699-
703 (1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al.,
Virology 205:121-126 (1994)).

Polynucleotide modulators of prostate cancer may be introduced into a cell 
containing the target nucleotide sequence by formation of a conjugate with a ligand binding 

15 molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are 
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that 
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does 
not substantially interfere with the ability of the ligand binding molecule to bind to its 
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide 

20 or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate 
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by 
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

25 As noted above, gene expression monitoring is conveniently used to test 

candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent
has been added and the cells allowed to incubate for some period of time, the sample 
containing a target sequence to be analyzed is added to the biochip. If required, the target 
sequence is prepared using known techniques. For example, the sample may be treated to 

30 lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or 

amplification such as PCR performed as appropriate. For example, an in vitro transcription 
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a 
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of 

5 detecting the target sequence's specific binding to a probe. The label also can be an enzyme, 
such as, alkaline phosphatase or horseradish peroxidase, which when provided with an 
appropriate substrate produces a product that can be detected. Alternatively, the label can be 
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not 
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as, an 

10 epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the 
streptavidin is labeled as described above, thereby, providing a detectable signal for the 
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis. 

As will be appreciated by those in the art, these assays can be direct 
hybridization assays or can comprise "sandwich assays", which include the use of multiple 

15 probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730, 

5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,

5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined 
above, and then added to the biochip comprising a plurality of nucleic acid probes, under 

20 conditions that allow the formation of a hybridization complex. 

A variety of hybridization conditions may be used in the present invention, 
including high, moderate and low stringency conditions as outlined above. The assays are 
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by 

25 altering a step parameter that is a thermodynamic variable, including, but not limited to, 
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc. 

These parameters may also be used to control non-specific binding, as is 
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain 

30 steps at higher stringency conditions to reduce non-specific binding. 
The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders, 
with preferred embodiments outlined below. In addition, the reaction may include a variety 
of other reagents. These include salts, buffers, neutral proteins, e.g. albumin, detergents, etc. 

5 which may be used to facilitate optimal hybridization and detection, and/or reduce non- 
specific or background interactions. Reagents that otherwise improve the efficiency of the 
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be 
used as appropriate, depending on the sample preparation methods and purity of the target. 

The assay data are analyzed to determine the expression levels, and changes in 

10 expression levels as between states, of individual genes, forming a gene expression profile. 
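
For illustration, a gene expression profile of this kind can be represented as the set of genes whose expression differs between two states, each with its log2 ratio. The Python sketch below assumes per-gene expression levels held in dictionaries and an arbitrary 2-fold cutoff; the function name expression_profile, the gene names and the values are hypothetical and are not data from the Tables.

```python
import math


def expression_profile(normal: dict, tumor: dict, min_fold: float = 2.0) -> dict:
    """Return {gene: log2(tumor / normal)} for differentially expressed genes.

    Only genes measured in both states with positive levels are compared.
    A gene is kept when its change, up or down, is at least min_fold
    (an assumed cutoff, not one prescribed by the specification).
    """
    if min_fold < 1.0:
        raise ValueError("min_fold must be at least 1")
    threshold = math.log2(min_fold)
    profile = {}
    for gene in sorted(normal.keys() & tumor.keys()):
        if normal[gene] <= 0 or tumor[gene] <= 0:
            continue  # a ratio is undefined without positive levels
        log2_ratio = math.log2(tumor[gene] / normal[gene])
        if abs(log2_ratio) >= threshold:
            profile[gene] = log2_ratio
    return profile


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    normal = {"geneA": 100.0, "geneB": 200.0, "geneC": 50.0}
    tumor = {"geneA": 400.0, "geneB": 210.0, "geneC": 5.0}
    for gene, change in expression_profile(normal, tumor).items():
        print(gene, round(change, 2))
```

Log ratios are used so that a 4-fold increase and a 4-fold decrease have equal magnitude and opposite sign.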

Screens are performed to identify modulators of the prostate cancer 
phenotype. In one embodiment, screening is performed to identify modulators that can 
induce or suppress a particular expression profile, thus preferably generating the associated 
phenotype. In another embodiment, e.g., for diagnostic applications, having identified 

15 differentially expressed genes important in a particular state, screens can be performed to 
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the 
expression product of a differentially expressed gene. Again, having identified the 
importance of a gene in a particular state, screens are performed to identify agents that bind 
20 and/or modulate the biological activity of the gene product.

In addition screens can be done for genes that are induced in response to a 
candidate agent. After identifying a modulator based upon its ability to suppress a prostate 
cancer expression pattern leading to a normal expression pattern, or to modulate a single 
prostate cancer gene expression profile so as to mimic the expression of the gene from 

25 normal tissue, a screen as described above can be performed to identify genes that are 
specifically modulated in response to the agent. Comparing expression profiles between 
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in 
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These agent- 
specific sequences can be identified and used by methods described herein for prostate cancer 

30 genes or proteins. In particular these sequences and the proteins they encode find use in 
marking or identifying agent treated cells. In addition, antibodies can be raised against the 
agent induced proteins and used to target novel therapeutics to the treated prostate cancer
tissue sample. 
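
The comparison described above, identifying genes expressed only in agent treated tissue, amounts to a set difference. The following Python sketch assumes each tissue is summarized as a collection of expressed gene identifiers; the function name agent_specific_genes and the identifiers are hypothetical.

```python
def agent_specific_genes(normal, cancer, treated) -> set:
    """Genes expressed in agent treated tissue but in neither
    normal tissue nor untreated prostate cancer tissue."""
    return set(treated) - set(normal) - set(cancer)


if __name__ == "__main__":
    # Hypothetical gene identifiers, for demonstration only.
    normal = {"geneA", "geneB"}
    cancer = {"geneB", "geneC"}
    treated = {"geneA", "geneB", "geneX"}
    print(sorted(agent_specific_genes(normal, cancer, treated)))
```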

Thus, in one embodiment, a test compound is administered to a population of 
prostate cancer cells that have an associated prostate cancer expression profile. By

5 "administration" or "contacting" herein is meant that the candidate agent is added to the cells 
in such a manner as to allow the agent to act upon the cell, whether by uptake and 
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid 
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct 
such as an adenoviral or retroviral construct, and added to the cell, such that expression of 

10 the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems 
can also be used. 

Once the test compound has been administered to the cells, the cells can be 
washed if desired and are allowed to incubate under preferably physiological conditions for 
some period of time. The cells are then harvested and a new gene expression profile is 
generated, as outlined herein. 

Thus, e.g., prostate cancer tissue may be screened for agents that modulate, 
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on prostate 
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for 
new drugs that alter the phenotype can be devised. With this approach, the drug target need 
not be known and need not be represented in the original expression screening platform, nor 
does the level of transcript for the target protein need to change. 

In a preferred embodiment, as outlined above, screens may be done on 
individual genes and gene products (proteins). That is, having identified a particular 
differentially expressed gene as important in a particular state, screening of modulators of 
either the expression of the gene or the gene product itself can be done. The gene products of 
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins" 
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a 
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic 
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a 
preferred embodiment, the prostate cancer amino acid sequence which is used to determine 
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another 
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a 
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as 
further described herein. 

Preferably, the prostate cancer modulatory protein is a fragment of 
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble 
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred 
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the 
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in 
coupling, i.e., to cysteine. 

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA. 

Measurements of prostate cancer polypeptide activity, or of prostate cancer or 
the prostate cancer phenotype can be performed using a variety of assays. For example, the 
effects of the test compounds upon the function of the prostate cancer polypeptides can be 
measured by examining parameters described above. A suitable physiological change that 
affects activity can be used to assess the influence of a test compound on the polypeptides of 
this invention. When the functional consequences are determined using intact cells or 
animals, one can also measure a variety of effects such as, in the case of prostate cancer 
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone 
release, transcriptional changes to both known and uncharacterized genetic markers (e.g., 
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes 
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian 
prostate cancer polypeptide is typically used, e.g., mouse, preferably human. 

Assays to identify compounds with modulating activity can be performed in 
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator 
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, 
the prostate cancer polypeptide levels are determined in vitro by measuring the level of 
protein or mRNA. The level of protein is measured using immunoassays such as western 
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer 
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using 
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot 
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly 
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids, 
radioactively or enzymatically labeled antibodies, and the like, as described herein. 

Alternatively, a reporter gene system can be devised using the prostate cancer 
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent 
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After 
treatment with a potential modulator, the amount of reporter gene transcription, translation, or 
activity is measured according to standard techniques known to those of skill in the art. 

In a preferred embodiment, as outlined above, screens may be done on 
individual genes and gene products (proteins). That is, having identified a particular 
differentially expressed gene as important in a particular state, screening of modulators of the 
expression of the gene or the gene product itself can be done. The gene products of 
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins." 
The prostate cancer protein may be a fragment, or alternatively, be the full length protein to a 
fragment shown herein. 

In one embodiment, screening for modulators of expression of specific genes 
is performed. Typically, the expression of only one or a few genes is evaluated. In another 
embodiment, screens are designed to first find compounds that bind to differentially 
expressed proteins. These compounds are then evaluated for the ability to modulate 
differentially expressed activity. Moreover, once initial candidate compounds are identified, 
variants can be further screened to better evaluate structure activity relationships. 

In a preferred embodiment, binding assays are done. In general, purified or 
isolated gene product is used; that is, the gene products of one or more differentially 
expressed nucleic acids are made. For example, antibodies are generated to the protein gene 
products, and standard immunoassays are run to determine the amount of protein present. 
Alternatively, cells comprising the prostate cancer proteins can be used in the assays. 

Thus, in a preferred embodiment, the methods comprise combining a prostate 
cancer protein and a candidate compound, and determining the binding of the compound to 
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein, 
although other mammalian proteins may also be used, e.g. for the development of animal 
models of human disease. In some embodiments, as outlined herein, variant or derivative 
prostate cancer proteins may be used. 

Generally, in a preferred embodiment of the methods herein, the prostate 
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having 
isolated sample receiving areas (e.g. a microtiter plate, an array, etc.). The insoluble 
supports may be made of any composition to which the compositions can be bound, is readily 
separated from soluble material, and is otherwise compatible with the overall method of 
screening. The surface of such supports may be solid or porous and of any convenient shape. 
Examples of suitable insoluble supports include microtiter plates, arrays, membranes and 
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon 
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because 
a large number of assays can be carried out simultaneously, using small amounts of reagents 
and samples. The particular manner of binding of the composition is not crucial so long as it 
is compatible with the reagents and overall methods of the invention, maintains the activity of 
the composition and is nondiffusable. Preferred methods of binding include the use of 
antibodies (which do not sterically block either the ligand binding site or activation sequence 
when the protein is bound to the support), direct binding to "sticky" or ionic supports, 
chemical crosslinking, the synthesis of the protein or agent on the surface, etc. Following 
binding of the protein or agent, excess unbound material is removed by washing. The sample 
receiving areas may then be blocked through incubation with bovine serum albumin (BSA), 
casein or other innocuous protein or other moiety. 

In a preferred embodiment, the prostate cancer protein is bound to the support, 
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the 
support and the prostate cancer protein is added. Novel binding agents include specific 
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide 
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for 
human cells. A wide variety of assays may be used for this purpose, including labeled in 
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for 
protein binding, functional assays (phosphorylation assays, etc.) and the like. 

The determination of the binding of the test modulating compound to the 
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the 
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of 
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a 
fluorescent label), washing off excess reagent, and determining whether the label is present 
on the solid support. Various blocking and washing steps may be utilized as appropriate. 

In some embodiments, only one of the components is labeled, e.g., the 
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than 
one component can be labeled with different labels, e.g., 125I for the proteins and a fluorophor 
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also 
useful. 

In one embodiment, the binding of the test compound is determined by 
competitive binding assay. The competitor is a binding moiety known to bind to the target 
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner, 
ligand, etc. Under certain circumstances, there may be competitive binding between the 
compound and the binding moiety, with the binding moiety displacing the compound. In one 
embodiment, the test compound is labeled. Either the compound, or the competitor, or both, 
is added first to the protein for a time sufficient to allow binding, if present. Incubations may 
be performed at a temperature which facilitates optimal activity, typically between 4 and 
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput 
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally 
removed or washed away. The second component is then added, and the presence or absence 
of the labeled component is followed, to indicate binding. 

In a preferred embodiment, the competitor is added first, followed by the test 
compound. Displacement of the competitor is an indication that the test compound is binding 
to the prostate cancer protein and thus is capable of binding to, and potentially modulating, 
the activity of the prostate cancer protein. In this embodiment, either component can be 
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution 
indicates displacement by the agent. Alternatively, if the test compound is labeled, the 
presence of the label on the support indicates displacement. 

In an alternative embodiment, the test compound is added first, with 
incubation and washing, followed by the competitor. The absence of binding by the 
competitor may indicate that the test compound is bound to the prostate cancer protein with a 
higher affinity. Thus, if the test compound is labeled, the presence of the label on the 
support, coupled with a lack of competitor binding, may indicate that the test compound is 
capable of binding to the prostate cancer protein. 

In a preferred embodiment, the methods comprise differential screening to 
identify agents that are capable of modulating the activity of the prostate cancer proteins. In 
this embodiment, the methods comprise combining a prostate cancer protein and a competitor 
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and 
a competitor. The binding of the competitor is determined for both samples, and a change, or 
difference in binding between the two samples indicates the presence of an agent capable of 
binding to the prostate cancer protein and potentially modulating its activity. That is, if the 
binding of the competitor is different in the second sample relative to the first sample, the 
agent is capable of binding to the prostate cancer protein. 
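
The two-sample comparison above can be sketched as follows. This is only an illustration under stated assumptions: the competitor is taken to be the labeled component, each sample is read in replicate, and a compound is flagged when competitor binding changes by at least a chosen fraction. The function names, the 0.5 cutoff, and the readings are hypothetical and do not come from the disclosure.

```python
from statistics import mean


def competitor_binding_change(first_sample, second_sample):
    """Fractional change in competitor binding caused by the test compound.

    first_sample:  replicate signals for protein + competitor
    second_sample: replicate signals for protein + competitor + test compound
    Returns (second - first) / first, computed on replicate means.
    """
    baseline = mean(first_sample)
    if baseline == 0:
        raise ValueError("no competitor binding in the first sample")
    return (mean(second_sample) - baseline) / baseline


def is_candidate_binder(first_sample, second_sample, min_change=0.5):
    """Flag the test compound when competitor binding differs by at least
    `min_change` (an assumed cutoff) between the two samples."""
    change = competitor_binding_change(first_sample, second_sample)
    return abs(change) >= min_change


# Hypothetical triplicate readings (e.g., counts per minute).
without_compound = [1000.0, 1100.0, 900.0]
with_compound = [300.0, 250.0, 350.0]

change = competitor_binding_change(without_compound, with_compound)
print(round(change, 2))                                      # -0.7
print(is_candidate_binder(without_compound, with_compound))  # True
```

A fixed cutoff is used here only for brevity; in practice the difference between samples would be judged against replicate variability and the positive and negative controls.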

Alternatively, differential screening is used to identify drug candidates that 
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer 
proteins. The structure of the prostate cancer protein may be modeled, and used in rational 
drug design to synthesize agents that interact with that site. Drug candidates that affect the 
activity of a prostate cancer protein are also identified by screening drugs for the ability to 
either enhance or reduce the activity of the protein. 

Positive controls and negative controls may be used in the assays. Preferably 
control and test samples are performed in at least triplicate to obtain statistically significant 
results. Incubation of all samples is for a time sufficient for the binding of the agent to the 
protein. Following incubation, samples are washed free of non-specifically bound material 
and the amount of bound, generally labeled agent determined. For example, where a 
radiolabel is employed, the samples may be counted in a scintillation counter to determine the 
amount of bound compound. 

A variety of other reagents may be included in the screening assays. These 
include reagents like salts, neutral proteins, e.g. albumin, detergents, etc. which may be used 
to facilitate optimal protein-protein binding and/or reduce non-specific or background 
interactions. Also reagents that otherwise improve the efficiency of the assay, such as 
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture 
of components may be added in an order that provides for the requisite binding. 

In a preferred embodiment, the invention provides methods for screening for a 
compound capable of modulating the activity of a prostate cancer protein. The methods 
comprise adding a test compound, as defined above, to a cell comprising prostate cancer 
proteins. Preferred cell types include almost any cell. The cells contain a recombinant 
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of 
candidate agents is tested on a plurality of cells. 

In one aspect, the assays are evaluated in the presence or absence or previous 
or subsequent exposure of physiological signals, e.g. hormones, antibodies, peptides, 
antigens, cytokines, growth factors, action potentials, pharmacological agents including 
chemotherapeutics, radiation, carcinogenics, or other cells (i.e. cell-cell contacts). In another 
example, the determinations are made at different stages of the cell cycle process. 

In this way, compounds that modulate prostate cancer agents are identified. 
Compounds with pharmacological activity are able to enhance or interfere with the activity of 
the prostate cancer protein. Once identified, similar structures are evaluated to identify 
critical structural features of the compound. 

In one embodiment, a method of inhibiting prostate cancer cell division is 
provided. The method comprises administration of a prostate cancer inhibitor. In another 
embodiment, a method of inhibiting prostate cancer is provided. The method comprises 
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating 
cells or individuals with prostate cancer are provided. The method comprises administration 
of a prostate cancer inhibitor. 

In one embodiment, a prostate cancer inhibitor is an antibody as discussed 
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule. 

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are 
transformed, they lose this phenotype and grow detached from the substrate. For example, 
transformed cells can grow in stirred suspension culture or suspended in semi-solid media, 
such as semi-solid or soft agar. The transformed cells, when transfected with tumor 
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and 
grow. Soft agar growth or colony formation in suspension assays can be used to identify 
modulators of prostate cancer sequences, which when expressed in host cells, inhibit 
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or 
eliminate the host cells' ability to grow in stirred suspension culture or suspended in semi-solid 
media, such as semi-solid or soft agar. 

Techniques for soft agar growth or colony formation in suspension assays are 
described in Freshney, Culture of Animal Cells: A Manual of Basic Technique (3rd ed., 1994), 
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996), 
supra, herein incorporated by reference. 

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until 
they touch other cells. When the cells touch one another, they are contact inhibited and stop 
growing. When cells are transformed, however, the cells are not contact inhibited and 
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a 
higher saturation density than normal cells. This can be detected morphologically by the 
formation of a disoriented monolayer of cells or rounded cells in foci within the regular 
pattern of normal surrounding cells. Alternatively, labeling index with (3H)-thymidine at 
saturation density can be used to measure density limitation of growth. See Freshney (1994), 
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a 
normal phenotype and become contact inhibited and would grow to a lower density. 

In this assay, labeling index with (3H)-thymidine at saturation density is a 
preferred method of measuring density limitation of growth. Transformed host cells are 
transfected with a prostate cancer-associated sequence and are grown for 24 hours at 
saturation density in non-limiting medium conditions. The percentage of cells labeling with 
(3H)-thymidine is determined autoradiographically. See, Freshney (1994), supra. 

Growth factor or serum dependence 

Transformed cells have a lower serum dependence than their normal 
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp. 
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth 
factors by the transformed cells. Growth factor or serum dependence of transformed host 
cells can be compared with that of control. 

Tumor specific markers levels 

Tumor cells release a greater amount of certain factors (hereinafter "tumor 
specific markers") than their normal counterparts. For example, plasminogen activator (PA) 
is released from human glioma at a higher level than from normal brain cells (see, e.g., 
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth. 
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor 
angiogenesis factor (TAF) is released at a higher level in tumor cells than in their normal 
counterparts. See, e.g., Folkman, Angiogenesis and Cancer, Sem. Cancer Biol. (1992). 

Various techniques which measure the release of these factors are described in 
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974); 
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer 42:305-312 
(1980); Gullino, Angiogenesis, tumor vascularization, and potential interference with 
tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985); 
Freshney, Anticancer Res. 5:111-130 (1985). 

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix 
constituent can be used as an assay to identify compounds that modulate prostate cancer-associated 
sequences. Tumor cells exhibit a good correlation between malignancy and 
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this 
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor 
gene in these host cells would decrease invasiveness of the host cells. 
Techniques described in Freshney (1994), supra, can be used. Briefly, the 
level of invasion of host cells can be measured by using filters coated with Matrigel or some 
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of 
the filter, is rated as invasiveness, and rated histologically by number of cells and distance 
moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of 
the filter or bottom of the dish. See, e.g., Freshney (1984), supra. 

Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in 
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which 
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted. Knock-out 
transgenic mice can be made by insertion of a marker gene or other heterologous gene 
into the endogenous prostate cancer gene site in the mouse genome via homologous 
recombination. Such mice can also be made by substituting the endogenous prostate cancer 
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous 
prostate cancer gene, e.g., by exposure to carcinogens. 

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells 
containing the newly engineered genetic lesion are injected into a host mouse embryo, which 
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice 
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the 
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic 
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be 
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual, 
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A 
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987). 

Alternatively, various immune-suppressed or immune-deficient host animals 
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J. 
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated 
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer 
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10^6 cells) 
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while 
normal cells of similar origin will not. In hosts which developed invasive tumors, cells 
expressing a prostate cancer-associated sequence are injected subcutaneously. After a 
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or 
by its two largest dimensions) and compared to the control. Tumors that have statistically 
significant reduction (using, e.g., Student's t test) are said to have inhibited growth. 
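
As a rough illustration of the statistical comparison mentioned above, the following sketch computes a two-sample Student's t statistic with pooled variance from tumor measurements in treated and control animals. The function name and the example volumes are hypothetical. The sketch returns only the statistic and its degrees of freedom; the corresponding significance level would be read from a t distribution table or obtained from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def student_t(treated, control):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). A negative t means the treated
    group has the smaller mean tumor measurement.
    """
    n1, n2 = len(treated), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two measurements")
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / df
    if pooled == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(treated) - mean(control)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical tumor volumes (mm^3) after the growth period.
treated_volumes = [210.0, 190.0, 200.0, 220.0, 180.0]
control_volumes = [400.0, 420.0, 380.0, 410.0, 390.0]

t_stat, dof = student_t(treated_volumes, control_volumes)
print(t_stat, dof)
```

The pooled-variance form assumes the two groups have similar variances; where that assumption is doubtful, an unequal-variance test would be the more cautious choice.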



Methods of identifying variant prostate cancer-associated sequences 

Without being bound by theory, expression of various prostate cancer 
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or 
variant prostate cancer genes may be determined. In one embodiment, the invention provides 
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or 
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be 
accomplished using any number of sequencing techniques. In a preferred embodiment, the 
invention provides methods of identifying the prostate cancer genotype of an individual, e.g., 
determining all or part of the sequence of at least one prostate cancer gene of the individual. 
This is generally done in at least one tissue of the individual, and may include the evaluation 
of a number of tissues or different samples of the same tissue. The method may include 
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer 
gene, i.e., a wild-type gene. 

The sequence of all or part of the prostate cancer gene can then be compared 
to the sequence of a known prostate cancer gene to determine if any differences exist. This 
can be done using any number of known homology programs, such as Bestfit, etc. In a 
preferred embodiment, the presence of a difference in the sequence between the prostate 
cancer gene of the patient and the known prostate cancer gene correlates with a disease state 
or a propensity for a disease state, as outlined herein. 
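
The sequence comparison step can be illustrated with a simplified stand-in. The sketch below is not Bestfit and performs no alignment; it assumes the sequenced gene and the known gene have already been aligned to equal length (with "-" marking gaps) and simply reports the positions at which they differ. The function name and both sequences are hypothetical.

```python
def sequence_differences(sample, reference):
    """List positions where two pre-aligned, equal-length sequences differ.

    Returns (position, reference_base, sample_base) tuples with 1-based
    positions. A '-' character denotes a gap introduced by the alignment.
    """
    if len(sample) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (position, ref_base, obs_base)
        for position, (ref_base, obs_base) in enumerate(pairs, start=1)
        if ref_base != obs_base
    ]


reference = "ATGGCCTTAGC"  # hypothetical known sequence
patient = "ATGGCGTTA-C"    # hypothetical sequenced gene, already aligned

print(sequence_differences(patient, reference))
# [(6, 'C', 'G'), (10, 'G', '-')]
```

An empty list indicates that no differences were found over the compared region.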

In a preferred embodiment, the prostate cancer genes are used as probes to 
determine the number of copies of the prostate cancer gene in the genome. 

In another preferred embodiment, the prostate cancer genes are used as probes 
to determine the chromosomal localization of the prostate cancer genes. Information such as 
chromosomal localization finds use in providing a diagnosis or prognosis in particular when 
chromosomal abnormalities such as translocations, and the like are identified in the prostate 
cancer gene locus. 




Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a prostate cancer 
protein or modulator thereof, is administered to a patient. By "therapeutically effective dose" 
herein is meant a dose that produces effects for which it is administered. The exact dose will 
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art 
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug 
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN 
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and 
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations 
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus 
localized delivery, and rate of new protease synthesis, as well as the age, body weight, 
general health, sex, diet, time of administration, drug interaction and the severity of the 
condition may be necessary, and will be ascertainable with routine experimentation by those 
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of 
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly 
incorporated by reference. 

A "patient" for the purposes of the present invention includes both humans 
and other animals, particularly mammals. Thus the methods are applicable to both human 
therapy and veterinary applications. In the preferred embodiment the patient is a mammal, 
preferably a primate, and in the most preferred embodiment the patient is human. 

The administration of the prostate cancer proteins and modulators thereof of 
the present invention can be done in a variety of ways as discussed above, including, but not 
limited to, orally, subcutaneously, intravenously, intranasally, transdermally, 
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In 
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer 
proteins and modulators may be directly applied as a solution or spray. 

The pharmaceutical compositions of the present invention comprise a prostate 
cancer protein in a form suitable for administration to a patient. In the preferred embodiment, 
the pharmaceutical compositions are in a water soluble form, such as being present as 
pharmaceutically acceptable salts, which is meant to include both acid and base addition 
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the 
biological effectiveness of the free bases and that are not biologically or otherwise 
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid, 
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid, 
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic 
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid, 
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like. 
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases 
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper, 
manganese, aluminum salts and the like. Particularly preferred are the ammonium, 
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically 
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines, 
substituted amines including naturally occurring substituted amines, cyclic amines and basic 
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine, 
tripropylamine, and ethanolamine. 

The pharmaceutical compositions may also include one or more of the 
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline 
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring 
agents; coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit 
dosage forms depending upon the method of administration. For example, unit dosage forms 
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules 
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies, 
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally, 
should be protected from digestion. This is typically accomplished either by complexing the 

25 molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by 
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a 
protection barrier. Means of protecting agents from digestion are well known in the art. 

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Science (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present prostate cancer protein-modulating
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasma vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel et al., eds., Current Protocols (supplemented through 1999),
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin. Exp. Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle absorbed cDNA (Ulmer et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol. Med. Today 6:66-71 (2000); Shedlock et al., J. Leukoc. Biol. 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., antisense RNA directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer
protein. Depending on the desired expression level, promoters of various strengths can be
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are
additionally useful in screening for modulators to treat prostate cancer.

Kits for Use in Diagnostic and/or Prognostic Applications

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials, they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
prostate cancer-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing prostate cancer-associated activity. Optionally, the kit contains
biologically active prostate cancer protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.
EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month, or the purification may be continued directly.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temperature for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.
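
The protocol gives some spins in rpm and others in x g. As a minimal sketch for converting between the two, the standard relation RCF = 1.118e-5 x r(cm) x rpm^2 can be used. The rotor radius is not stated in this document; the 10.5 cm value in the usage note is purely an illustrative assumption and should be replaced by the radius of the rotor actually used.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given rotor speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Hypothetical rotor radius (an assumption, not a value from the protocol).
ASSUMED_RADIUS_CM = 10.5
force_at_6500_rpm = rcf_from_rpm(6500, ASSUMED_RADIUS_CM)
```

With the assumed 10.5 cm radius, 8000 rpm corresponds to roughly 7500 x g; a different rotor will give a different result.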

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temperature for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for
5 minutes at 4°C.
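
The reagent volumes in the steps above all scale with the TRIzol volume, which in turn scales with tissue weight. The following sketch simply restates those ratios (1 ml TRIzol per 50 mg tissue; 0.2 ml chloroform, 0.5 ml isopropyl alcohol and 1 ml 75% ethanol per 1 ml TRIzol); the function name and the dictionary keys are illustrative choices, not part of the protocol.

```python
def trizol_volumes(tissue_mg: float) -> dict:
    """Reagent volumes (ml) for a TRIzol extraction, scaled to tissue weight."""
    trizol_ml = tissue_mg / 50.0          # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ml": 0.2 * trizol_ml,   # phase separation
        "isopropanol_ml": 0.5 * trizol_ml,  # RNA precipitation
        "ethanol_75_ml": 1.0 * trizol_ml,   # RNA wash
    }


volumes_for_250_mg = trizol_volumes(250)
```

For a 250 mg sample this gives 5 ml TRIzol, 1 ml chloroform, 2.5 ml isopropyl alcohol and 5 ml of 75% ethanol.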

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding down the tube into the new tube by use of a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in an
appropriate volume of DEPC H2O (e.g., to 2-5 ug/ul). The absorbance is then measured.
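
The document does not state how the absorbance reading is converted to a yield. A common convention, assumed here, is that one A260 unit in a 1 cm path corresponds to about 40 ug/ml of single-stranded RNA. The sketch below applies that factor; the function name, the dilution factor and the sample values are hypothetical.

```python
# Conventional conversion factor (assumption, not from this document):
# 1 A260 unit ~ 40 ug/ml RNA at a 1 cm path length.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_yield(a260: float, dilution_factor: float, volume_ul: float):
    """Return (concentration in ug/ul, total yield in ug) of an RNA sample."""
    conc_ug_per_ul = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / 1000.0
    return conc_ug_per_ul, conc_ug_per_ul * volume_ul


# Hypothetical reading: A260 of 0.5 on a 1:100 dilution of a 50 ul sample.
concentration, total_ug = rna_yield(0.5, 100, 50)
```

In this hypothetical case the stock would be 2 ug/ul, i.e. within the 2-5 ug/ul range mentioned above.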

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, warm up the 2 x
Binding Buffer at 65°C. The total RNA is mixed with DEPC-treated water, 2 x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low.






The absorbance is next read to determine the yield, using diluted Elution 
Buffer as the blank. 

Before proceeding with cDNA synthesis, the mRNA is precipitated, as
components left over from the Oligotex purification procedure or present in the Elution Buffer
will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol from the pellet is then dried
without use of a speed vacuum and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.

Alternatively the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, the column is transferred to a new 2 ml collection tube, 500 ul Buffer
RPE is added, and the column is centrifuged for 15 sec at >10,000 rpm. The flowthrough is
discarded. 500 ul Buffer RPE is then added again and the preparation is centrifuged for 15 sec
at >10,000 rpm. The flowthrough is discarded, and the column membrane dried by centrifuging
for 2 min at maximum speed. The column is transferred to a new 1.5 ml collection tube, and
30-50 ul of RNase-free water is applied directly onto the column membrane. The column is then
centrifuged for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.


Second Strand Synthesis

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate
2 hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
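
As a quick consistency check on the second strand recipe, the additions can be summed together with the 20 ul first strand reaction. The sketch below only restates the volumes listed above; the dictionary and variable names are illustrative.

```python
FIRST_STRAND_UL = 20.0  # final volume of the first strand reaction

# Volumes (ul) added for second strand synthesis, as listed in the protocol.
SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10 U/ul)": 1.0,
    "E. coli DNA Polymerase (10 U/ul)": 4.0,
    "RNase H (2 U/ul)": 1.0,
}

total_ul = FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())

# Final strength of the 5X buffer once diluted into the full reaction.
buffer_final_x = 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / total_ul
```

The total comes to 150 ul, in which 30 ul of 5X buffer is at 1X, so the listed volumes are internally consistent.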

Cleaning up cDNA

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min at maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and resuspended in a volume
compatible with the fragmentation step.
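
The stated 20 ul final volume and the resulting nucleotide concentrations can be checked from the listed components. The sketch below is only arithmetic on the volumes and stock concentrations given above (taking the biotinylated nucleotides as Bio-11-UTP and Bio-16-CTP); the data structure and function name are illustrative assumptions.

```python
# (component, volume in ul, stock concentration in mM or None if not applicable)
IVT_COMPONENTS = [
    ("cDNA template", 1.5, None),
    ("T7 10x ATP", 2.0, 75.0),
    ("T7 10x GTP", 2.0, 75.0),
    ("T7 10x CTP", 1.5, 75.0),
    ("T7 10x UTP", 1.5, 75.0),
    ("Bio-11-UTP", 3.75, 10.0),
    ("Bio-16-CTP", 3.75, 10.0),
    ("10x T7 transcription buffer", 2.0, None),
    ("10x T7 enzyme mix", 2.0, None),
]


def ivt_summary(components):
    """Return the total reaction volume (ul) and final concentrations (mM)."""
    total_ul = sum(volume for _, volume, _ in components)
    final_mm = {
        name: stock * volume / total_ul
        for name, volume, stock in components
        if stock is not None
    }
    return total_ul, final_mm


ivt_total_ul, ivt_final_mm = ivt_summary(IVT_COMPONENTS)
```

The components sum to 20 ul, giving 7.5 mM ATP and GTP, 5.625 mM unlabeled CTP and UTP, and 1.875 mM of each biotinylated nucleotide.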

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.
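
The 5 x to 1 x dilution of the fragmentation buffer can be sketched as follows. The buffer composition and the 20 ul ceiling come from the paragraph above; the function name and the decision to raise an error above 20 ul are illustrative choices.

```python
# Composition of 5 x Fragmentation buffer (mM), as given in the protocol.
FRAG_BUFFER_5X_MM = {"Tris-acetate": 200.0, "KOAc": 500.0, "MgOAc": 150.0}
MAX_REACTION_UL = 20.0  # the protocol advises against larger volumes


def fragmentation_setup(reaction_ul: float):
    """Return (ul of 5 x buffer to add, final 1 x concentrations in mM)."""
    if reaction_ul > MAX_REACTION_UL:
        raise ValueError("fragmentation volume should not exceed 20 ul")
    buffer_5x_ul = reaction_ul / 5.0
    final_mm = {name: conc / 5.0 for name, conc in FRAG_BUFFER_5X_MM.items()}
    return buffer_5x_ul, final_mm


buffer_ul, final_conc_mm = fragmentation_setup(10.0)
```

For the recommended 10 ul reaction this gives 2 ul of 5 x buffer, with 40 mM Tris-acetate, 100 mM KOAc and 30 mM MgOAc in the final mixture.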

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; brought to 300 ul with 1x MES hyb buffer.
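
The relationship between mix volume and cRNA mass follows from the 50 ng/ul final concentration. The sketch below checks it and adds a generic C1V1 = C2V2 helper for the other components. The stock concentrations used with that helper (10 mg/ml herring sperm DNA, 50 mg/ml acetylated BSA) are assumptions for illustration; the hybridization mix recipe itself does not state them.

```python
def crna_ug(mix_volume_ul: float, final_ng_per_ul: float = 50.0) -> float:
    """Mass of fragmented cRNA (ug) contained in a volume of hybridization mix."""
    return mix_volume_ul * final_ng_per_ul / 1000.0


def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed (C1V1 = C2V2); both concentrations in the same units."""
    return final_conc * final_volume_ul / stock_conc


on_chip_ug = crna_ug(200)   # amount loaded on one chip
needed_ug = crna_ug(300)    # amount needed for a 300 ul mix

# Hypothetical stock concentrations (assumptions, not from the recipe):
herring_sperm_ul = stock_volume_ul(0.1, 10.0, 300)   # 10 mg/ml stock
bsa_ul = stock_volume_ul(0.5, 50.0, 300)             # 50 mg/ml stock
```

The 200 ul loaded per chip carries 10 ug, and a 300 ul mix needs 15 ug, which matches the 15 ug usually fragmented.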

The hybridization reaction is conducted with non-biotinylated IVT (purified
by RNeasy columns) (see Example 1 for steps from tissue to IVT). The following mixture is
prepared:

IVT antisense RNA, 4 ug:       __ ul
Random Hexamers (1 ug/ul):      4 ul
H2O:                           __ ul
Total:                         14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:

0.1 M DTT:                      3 ul
50X dNTP mix:                   0.6 ul
H2O:                            2.4 ul
Cy3 or Cy5 dUTP (1 mM):         3 ul
SS RT II (BRL):                 1 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SSII is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM of cold dATP, dCTP, and dGTP, and 10 mM
of dTTP, and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP and 10 ul of
100 mM dTTP to 15 ul H2O.
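
The stated concentrations of the dNTP mix described above can be verified from the recipe (25 ul each of 100 mM dATP, dCTP and dGTP, 10 ul of 100 mM dTTP, and 15 ul of water). The sketch below is only that arithmetic; the variable names are illustrative.

```python
STOCK_MM = 100.0  # each nucleotide stock is 100 mM

# Volumes (ul) combined to make the dNTP mix.
DNTP_MIX_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0}
WATER_UL = 15.0

mix_total_ul = sum(DNTP_MIX_UL.values()) + WATER_UL

# Final concentration (mM) of each nucleotide in the mix.
mix_final_mm = {
    name: STOCK_MM * volume / mix_total_ul for name, volume in DNTP_MIX_UL.items()
}
```

The total is 100 ul, so dATP, dCTP and dGTP end up at 25 mM each and dTTP at 10 mM, as stated.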

RNA degradation is performed as follows. Add 86 ul H2O and 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C for 10 min. For U-Con 30, add 500 ul TE/sample, spin at 7000 g
for 10 min, and save the flow through for purification. For Qiagen purification, suspend the u-con
recovered material in 500 ul buffer PB and proceed using the Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dilution of DNAse/30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.


Sample preparation

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat at
95°C for 2 min and slow cool at room temperature for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMTs
and channels.
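
The wash solutions above are simple dilutions of 20X SSC and 10% SDS. The sketch below checks the recipes, assuming that "in 250 ml H2O" means a final volume of 250 ml (an interpretation, since the text does not say so explicitly); the function name is illustrative.

```python
def diluted_strength(stock_strength: float, stock_ml: float, final_ml: float) -> float:
    """Strength of a solution after diluting stock_ml of stock to final_ml."""
    return stock_strength * stock_ml / final_ml


FINAL_ML = 250.0  # assumed final volume of each wash solution

wash1_ssc_x = diluted_strength(20.0, 37.5, FINAL_ML)    # first wash, SSC
wash1_sds_pct = diluted_strength(10.0, 0.75, FINAL_ML)  # first wash, SDS (%)
wash2_ssc_x = diluted_strength(20.0, 12.5, FINAL_ML)    # second wash
wash3_ssc_x = diluted_strength(20.0, 2.5, FINAL_ML)     # third wash
```

Under that assumption the three washes come out at 3X SSC with 0.03% SDS, 1X SSC and 0.2X SSC, matching the stated strengths.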

Example 2: Taxol resistant Xenograft Model of Human Prostate Cancer

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become,
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks, before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the same sub-therapeutic dose of taxol. This process was repeated
14 times, and a portion of each generation of xenograft tumor was collected. There was
increasing resistance to therapeutic doses of taxol with each generation. By the end of the
process, the tumors were fully resistant to therapeutic doses of taxol. RNA from each
generation of tumor was then isolated, and individual mRNA species were quantified using a
custom Affymetrix GeneChip® oligonucleotide microarray, with probes to interrogate
approximately 35,000 unique mRNA transcripts. Genes were selected that showed a
statistically significant up-regulation, or down-regulation, during the subsequent generations
of increasingly taxol-resistant tumors. Only one gene was significantly up-regulated, whereas
24 genes were down-regulated; these are presented in Table 10.
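
The document does not specify which statistical test was used to select genes that trend across xenograft generations. One simple possibility, offered only as an illustrative sketch and not as the method actually used, is to rank-correlate each gene's expression with generation number (Spearman's rho) and keep genes whose correlation exceeds a cutoff. The 0.9 threshold and the example expression values are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based) of the values, with ties sharing a mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def classify_trend(expression, threshold=0.9):
    """Label a per-generation expression series as 'up', 'down' or 'none'."""
    generations = list(range(1, len(expression) + 1))
    rho = pearson(ranks(generations), ranks(expression))  # Spearman's rho
    if rho >= threshold:
        return "up"
    if rho <= -threshold:
        return "down"
    return "none"


# Hypothetical expression values over five tumor generations.
example_label = classify_trend([120.0, 95.0, 80.0, 41.0, 12.0])
```

A steadily falling series such as the hypothetical one above is labeled "down"; a formal analysis would also need a significance estimate and correction for the roughly 35,000 transcripts tested.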
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The gene sequences identified to be overexpressed in prostate cancer may be 
used to identify coding regions from the public DNA database. The sequences may be used 

5 to either identify genes that encode known proteins, or they may be used to predict the coding 
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov 
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art 
would understand how to obtain the unigene cluster identification and sequence information 
according to the exemplar accession numbers provided in Tables 1-16. (see, 

10 http://www.ncbi.nlm.nih.gov/UniGene/). 
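
The selection of genes showing a statistically significant up-regulation or down-regulation across the successive taxol-resistant generations can be illustrated with a minimal sketch. The text does not state which statistical test was used, so the approach below (a least-squares trend of log2 expression against generation number, with a permutation p-value) is an assumption chosen only for illustration. The function names, the `alpha` cutoff, and the number of permutations are likewise hypothetical.

```python
import math
import random


def slope_and_r(x, y):
    """Return the least-squares slope and Pearson correlation of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, r


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    _, r_obs = slope_and_r(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, r = slope_and_r(x, shuffled)
        if abs(r) >= abs(r_obs):
            hits += 1
    return (hits + 1) / (n_perm + 1)


def classify_trend(expression, alpha=0.01):
    """Label one gene as 'up', 'down', or 'unchanged' across generations.

    `expression` holds one positive expression value per tumor generation,
    in passage order (assumed layout, not taken from the source).
    """
    generations = list(range(1, len(expression) + 1))
    log_values = [math.log2(v) for v in expression]
    slope, _ = slope_and_r(generations, log_values)
    if permutation_p(generations, log_values) >= alpha:
        return "unchanged"
    return "up" if slope > 0 else "down"
```

Applied to one expression series per probeset, such a function would sort probesets into up-regulated, down-regulated, and unchanged groups; the actual criteria behind the one up-regulated and 24 down-regulated genes reported above may differ.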



TABLE 1: shows genes, including expression sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue
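
The column key above can be applied mechanically to the rows of Table 1. The following is a minimal sketch, assuming each row is a single whitespace-delimited line in the order Pkey, UnigeneID (sometimes absent), ExAccn, Unigene Title, R1. The regular expression, the `Table1Row` class, and the `parse_row` helper are hypothetical names introduced for illustration and are not part of the source.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pkey, optional UnigeneID ("Hs." plus digits), ExAccn, free-text title, R1.
ROW = re.compile(
    r"^(?P<pkey>\d+)\s+"
    r"(?:(?P<unigene>Hs\.\d+)\s+)?"
    r"(?P<accn>\S+)\s+"
    r"(?P<title>.*?)\s+"
    r"(?P<r1>\d+(?:\.\d+)?)$"
)


@dataclass
class Table1Row:
    pkey: int
    unigene_id: Optional[str]
    ex_accn: str
    title: str
    r1: float


def parse_row(line):
    """Parse one table line into a Table1Row, or return None if it does not fit."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return Table1Row(
        pkey=int(match["pkey"]),
        unigene_id=match["unigene"],
        ex_accn=match["accn"],
        title=match["title"],
        r1=float(match["r1"]),
    )


row = parse_row("101050 Hs.1832 K01911 neuropeptide Y 17.3")
```

Rows whose identifiers or ratios were damaged in reproduction (for example a UnigeneID missing its period) do not fit this pattern cleanly and would need manual review before any automated filtering on R1.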



Pkey UnigeneID ExAccn Unigene Title R1

131919 Hs.272458 AA121266 ESTs 37.2

120328 Hs.290905 AA198979 ESTs; Weakly similar to (defline not ava 32.6

105201 Hs.51412 AA195626 ESTs 30.1

101486 Hs.1852 M24902 acid phosphatase; prostate 25.2

119073 Hs.279477 R32894 ESTs 24.8

133428 Hs.183752 M34376 microseminoprotein; beta- 23.8

128180 Hs.171995 AA595348 kallikrein 3; (prostate specific antigen 21.4

104080 Hs.57771 AA402971 Homo sapiens mRNA for serine protease (T 18.9

127537 Hs.162859 AA569531 ESTs 18.8 

131665 Hs.30343 R22139 ESTs 17.4

101050 Hs.1832 K01911 neuropeptide Y 17.3 

130771 Hs.1915 N48056 folate hydrolase (prostate-specific memb 17 

108153 Hs.40808 AA054237 ESTs 16.9 

107485 Hs.262476 W63793 S-adenosylmethionine decarboxylase 1 16.7 

106155 Hs.33287 AA425309 ESTs 16.5

129534 Hs.11260 R73640 ESTs 16.4

100569 Hs.171995 HG2261-HT2351 Antigen, Prostate Specific, Alt. Splice 16

101889 Hs.181350 S39329 kallikrein 2; prostatic 15.4

135389 Hs.99872 U05237 fetal Alzheimer antigen 15

101506 Hs.62192 M27436 coagulation factor III (thromboplastin; 13.5

134374 Hs.8236 D62633 ESTs 12.7 

133944 Hs.7780 AA045870 ESTs 12.5

109141 Hs.193380 AA176428 ESTs 12.5

130974 Hs.2178 X57985 H2B histone family; member Q 11.8

114768 Hs.182339 AA149007 ESTs 11.8

104394 Hs.172129 H46617 yp19h1.r1 Soares breast 3NbHBst Homo sap 115 

125299 Hs.102720 Z39436 ESTs 11.6 

104660 Hs.14846 AA007160 ESTs 11.4 

100116 Hs.76045 D00654 actin; gamma 2; smooth muscle; enteric 11

131061 Hs.268744 N64328 ESTs; Moderately similar to KIAA0273 [H. 10.9

126645 126645 AI167942 Homo sapiens BAC clone RG041D11 from 7q2 10.7

135153 Hs.95420 N40141 Homo sapiens mRNA for JM27 protein; comp 10.6

107033 Hs.113314 AA599629 ESTs 10.6

118417 N66048 ESTs; Weakly similar to polymerase [H.sa 10.5

126758 Hs.293960 W37145 ESTs 10.2

115674 Hs.8364 AA406542 ESTs 10.1 

134989 Hs.92381 AA236324 ESTs; Weakly similar to !!!! ALU CLASS A 10.1

107102 Hs.30652 AA609723 ESTs 10.1 

116787 Hs.15641 H28581 ESTs 10.1 

115719 Hs.59622 AA416997 ESTs 10

123209 Hs.203270 AA489711 ESTs 9.9 

101664 Hs.121017 M60752 H2A histone family; member A 95 

112971 Hs.83883 T17185 ESTs 9.7 

102519 Hs.80296 U52969 Purkinje cell protein 4 9.7 

117984 Hs.106778 N51919 ESTs 9.7

105840 Hs.22209 AA398533 ESTs 9.4

129523 Hs.274509 M30894 T-cell receptor; gamma cluster 9.4

132964 Hs.187133 AA031360 ESTs 9.2

121853 Hs.98502 AA425887 ESTs 9

115764 Hs.91011 AA421562 anterior gradient 2 (Xenopus laevis; sec 8.9

119617 Hs.55999 W47380 ESTs 8.9

100552 Hs.301946 HG2167-HT2237 Protein Kinase Ht31, Camp-Dependent 8.9

105627 Hs.23317 AA281245 ESTs 8.8

101461 Hs.76422 M22430 phospholipase A2; group IIA (platelets; 8.7

131725 Hs.31146 AA456264 ESTs; Highly similar to (defline not ava 8.5

124526 Hs.293185 N62098 yz61c5.s1 Soares_multiple_sclerosis_2NbH 8.5

118528 Hs.49397 N67889 ESTs 8.2 

133845 Hs.76704 T68510 ESTs 8.2 

133354 Hs.334762 AA055552 ESTs; Weakly similar to KIAA0319 [H.sapi 8.1

105912 Hs.20415 AA402000 ESTs; Weakly similar to QS3786 [H.sapien 8

118018 Hs.278695 N95796 ESTs 8

100394 Hs.66052 D84276 CD38 antigen (p45) 8

114132 Hs.24192 Z38688 ESTs 7.9

116786 Hs.301527 H25836 tumor necrosis factor (ligand) superfami 7.7

106579 Hs23G23 AA456135 ESTs 7.6 

128790 Hs.105700 AA291725 secreted frizzled-related protein 4 7.5 

114985 Hs.72472 AA250737 ESTs 7 A 

112033 Hs22627 R43162 ESTs 7.1 

102398 U42359 Human N33 protein form 1 (N33) gene, exo 7

101201 Hs.2256 L22524 matrix metalloproteinase 7 (matrilysin; 6.9

109272 Hs.288462 AA195718 ESTs 6.9

103145 Hs.169849 X66276 myosin-binding protein C; slow-type 6.9

101803 Hs.155691 M86546 pre-B-cell leukemia transcription factor 6.8

120562 Hs.302267 AA280036 ESTs; Weakly similar to W01A6.c [C.elega 6.8

109112 Hs257924 AA169379 ESTs 6.8 

109795 Hs.326416 F10707 ESTs 6.7 

107532 Hs.173684 Z19643 ESTs; Weakly similar to (defline not ava 6.7

130336 Hs.171995 X07730 kallikrein 3; (prostate specific antigen 6.6

131425 Hs.26691 AA219134 ESTs 6.6

120588 Hs.16193 AA281591 Homo sapiens mRNA; cDNA DKFZp586B211 (fr 6.6

132902 Hs.59838 AA490969 ESTs 6.6

125674 Hs.323378 W28078 H.sapiens mRNA for transmembrane protein 6.6

133724 Hs.75746 U07919 aldehyde dehydrogenase 6 6.5

130343 Hs.278628 AA490262 ESTs; Moderately similar to APXL gene pr 6.5

120215 Hs.108787 Z41050 Homo sapiens Mcd4p homolog mRNA; complet 6.5 

129215 Hs.126085 AA176867 ESTs 6.5 

131881 Hs.3383 AA010163 upstream regulator element binding prot 6.5 

133376 Hs.7232 T23670 ESTs 6.4 

105376 Hs.8768 AA236559 ESTs; Weakly similar to neuronal thread 6.4

104674 Hs.26289 AA009527 ESTs 6.4

100727 Hs.334788 X07290 Human HF.12 gene mRNA 6.3

130150 Hs.15113 AF000573 homogentisate 1,2-dioxygenase (homogenti 6.3

121770 Hs.278428 AA421714 Homo sapiens mRNA for KIAA0898 protein; 6.3

123475 Hs25G528 AA599267 ESTs; Weakly similar to ANKYRIN; BRAIN V 6.3

133061 Hs.296638 AB000584 prostate differentiation factor 6.3

116429 Hs.279923 AA609710 ESTs; Weakly similar to similar to GTP-b 6.2

101233 Hs.878 L29003 sorbitol dehydrogenase 6.2

104691 Hs.37744 AA011176 ESTs 6.2

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2

127775 Hs.179902 H04106 ESTs; Weakly similar to (defline not ava 6.2

105500 Hs.222399 AA256485 ESTs 6.1

131463 Hs.2714 X74142 forkhead (Drosophila)-like 1 6.1

132116 Hs.40289 AA234767 ESTs 6

130828 Hs.203213 AA053400 ESTs 5.9

115357 Hs.72988 AA281793 ESTs 5.8 

105496 Hs.301997 AA256323 ESTs 5.7 

116334 Hs.48948 AA491457 ESTs 5.7 

107968 Hs.61539 AA034020 ESTs 5.7 

120132 Hs.125019 Z38839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6

106375 Hs.289072 AA443993 ESTs 5.6

132550 Hs.170195 AA029597 bone morphogenetic protein 7 (osteogenic 5.6

124777 Hs.140237 R41933 ESTs; Weakly similar to neuronal thread 5.6

100311 Hs.337616 D50640 phosphodiesterase 3B; cGMP-inhibited 5.6

101791 Hs.62354 M83822 Human beige-like protein (BGL) mRNA; par 5.5

117698 Hs.45107 N41002 ESTs 5.5

132387 Hs.281434 R70914 heat shock 70kD protein 1 5.5

122041 Hs.98732 AA431407 Homo sapiens Chromosome 16 BAC clone CIT 5.5

133723 Hs.262476 AA088851 S-adenosylmethionine decarboxylase 1 5.5

113938 W81598 ESTs 5.4

133015 Hs.246315 AA047036 ESTs 5.4

125745 Hs.75722 AI283493 ribophorin II 5.4

107295 Hs.80120 T34527 UDP-N-acetyl-alpha-D-galactosamine:polyp 5.4

108188 Hs.7780 AA056482 ESTs 5.3

100184 Hs.21223 D17408 calponin 1; basic; smooth muscle 5.3

104466 Hs.326392 N25110 Human guanine nucleotide exchange factor 5.3

104033 Hs.98944 AA365031 ESTs 5.3

110844 Hs.167531 N31952 ESTs; Weakly similar to (defline not ava 5.3

129056 Hs.108338 H70627 ESTs; Weakly similar to !!!! ALU SUBFAMI 55

102805 Hs.25351 U90304 Iroquois-class homeodomain protein 55

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel 55

129184 Hs.109201 W26769 ESTs; Highly similar to (defline not ava 5.2

134158 Hs.79428 U15174 BCL2/adenovirus E1B 19kD-interacting pro 5.2

107240 Hs.159872 D59368 ESTs 5.2

104787 AA027317 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.2

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 5.2

116646 Hs.194228 F03048 ESTs; Moderately similar to !!!! ALU SUB 5.2

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1

116188 Hs.184598 AA464728 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1

126259 Hs.281428 Z21472 ESTs; Moderately similar to !!!! ALU SUB 5.1

105921 Hs.169119 AA402613 ESTs 5.1

103375 Hs.54416 X91868 sine oculis homeobox (Drosophila) homolo 5.1

128871 Hs.106778 AA400271 ESTs; Highly similar to (defline not ava 5.1

112681 Hs.148932 R87331 ESTs; Moderately similar to semaphorin V 5.1

105784 Hs.226434 AA350771 ESTs 5.1 

116238 Hs.47144 AA479382 ESTs 5 

102913 Hs5D342 X07693 keratin 15 5 

103011 Hs526035 X52541 early growth response 1 5 

126023 H58881 yr36d09.r1 Soares fetal liver spleen 1NF 5

103709 Hs.13804 AA037316 ESTs 5

118981 Hs.59288 N93839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

134807 Hs.59732 X78932 zinc finger protein 273 5

100079 Hs.23311 AB002365 Human mRNA for KIAA0367 gene; partial cd 4.9

132047 Hs.3796 D83492 EphB6 4.9

132880 Hs.177537 AA444369 ESTs 4.9 

124049 Hs.74519 F10523 primase; polypeptide 2A(58kD) 45 

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 45 

104776 AA026349 ESTs 45 

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 45

103912 Hs.143087 AA251078 ESTs 45 

113961 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8 

105288 Hs.3585 AA233168 ESTs; Weakly similar to coded for by C, 4.8 

135035 Hs.284186 H89575 ESTs 45 

104144 Hs.183390 AA447439 ESTs; Weakly similar to ZINC FINGER PROT 45

129389 Hs.288126 AA621604 ESTs 45

125982 R98091 RAE1 (RNA export 1; S.pombe) homolog 4.8

125162 Hs.26243 W44682 ESTs 45

103023 Hs.117950 X53793 multifunctional polypeptide similar to S 4.7

129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7

104479 Hs.106390 N36040 ESTs 4.7 

103731 AA070545 zm7c3rl Stratagene neuroepithelium (#93 4.7 

126575 Hs.127602 W72416 ESTs 4.7

124578 Hs.231500 N68321 Human glucose transporter-like protein-I 4.7

130617 Hs.1874 M90516 glutamine-fructose-6-phosphate transamin 4.7

116752 Hs.91622 H06373 Homo sapiens clone 24456 mRNA sequence 4.7

100279 Hs.82007 D42084 Human mRNA for KIAA0094 gene; partial cd 4.7

126288 Hs.59576 AI479264 ESTs 4.7

131836 Hs.32990 AA610086 ESTs 4.7

106717 Hs.239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7
114542 Hs.91011 AA055768 ESTs 45 

103806 AA130614 zolr2,r1 Stratagene neuroepithelium NT2R 45 

130529 AA173238 small inducible cytokine A5 (RANTES) 45 

115675 HS52065 AA406546 ESTs 45 

111386 Hs.293798 N95326 ESTs 45

106503 Hs.29879 AA452411 ESTs 45 
119943 Hs.14158 W86835 copinelll 4.6 

104459 Hs.100070 M91493 EST 4.6 

100774 Hs.59603 HG371-HT1063 Mucin 1, Epithelial, Alt. Splice 6 4.6

100852 Hs.142653 HG2825-HT2949 Ret Transforming Gene 4.6

132015 Hs.3731 D11900 ESTs 4.6

126086 H70975 yr73g01.r1 Soares fetal liver spleen 1NF 4.6

130888 Hs.173094 F03819 ESTs 4.6

1063S0 Hs.20166 AA446964 Prostate stem cell antigen 4.6

126959 AA199853 ESTs; Moderately similar to !!!! ALU SUB 4.5

131564 Hs.29117 X91648 H.sapiens mRNA for pur alpha extended 3' 4.5

104838 Hs.20953 AA039481 ESTs 4.5

125661 R50319 ESTs 4.5

103171 Hs.234726 X68733 alpha-1-antichymotrypsin 4.5

103928 Hs.199160 AA280085 ESTs 4.5 

102899 Hs.75730 X06272 signal recognition particle receptor (d 4.5 

100892 Hs.180769 HG4557-HT4962 Small Nuclear Ribonucleoprotein U1, 1snr 4.5

106167 Hs.7956 AA425906 ESTs 4.5

129404 Hs.317584 AA172G56 ESTs 4.5

106990 Hs.24758 AA521354 ESTs 4.5

132316 Hs.44566 U28831 Human protein immune-reactive with anti- 4.4

132056 Hs.38176 T89386 Homo sapiens mRNA for KIAA0606 protein; 4.4

133718 Hs.198760 X15306 neurofilament; heavy polypeptide (200kD) 4.4

101470 Hs.1846 M22898 tumor protein p53 (Li-Fraumeni syndrome) 4.4

131904 Hs.284296 AA143019 ESTs; Highly similar to surface 4 integr 4.4

105804 Hs.22514 AA383142 ESTs 4.4

122861 Hs.119394 AA464428 ESTs 4.4

111336 Hs29B94 N79565 ESTs 4.4

121944 Hs.98518 AA429278 ESTs 4.4

134401 Hs.211577 AA243748 ESTs; Highly similar to CG1 protein [H.s 4.4

126458 Hs.288969 AA815252 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.4

133435 Hs.323966 T23983 ESTs; Moderately similar to !!!! ALU SUB 4.4

105178 Hs.21941 AA187490 ESTs 4.3

127315 AA640834 nr27b06.r1 NCI_CGAP_Pr3 Homo sapiens cDN 4.3

132645 Hs.54424 X87870 H.sapiens mRNA for hepatocyte nuclear fa 4.3

116162 Hs.282990 AA461487 ESTs; Weakly similar to F52C12.2 [C.eleg 4.3

118040 Hs.47567 N52876 EST 4.3 

130008 Hs.278427 M31423 cerebellar degeneration-related protein 4.3 

126607 Hs.114688 W87424 ESTs 4.3

123061 Hs.105130 AA482030 EST 4.3 

109391 Hs.184245 AA219699 ESTs 4.3 

109175 AA180496 ESTs 4.3 

127003 Hs.173540 AA550806 ESTs; Weakly similar to (defline not ava 4.3

102547 Hs.46638 U57911 chromosome 11 open reading frame 0 4.3

134208 Hs.79993 U88871 peroxisomal biogenesis factor 7 4.3

104258 Hs.5462 AF007216 solute carrier family 4; sodium bicarbon 4.3

130759 Hs.18946 AA094720 ESTs; Weakly similar to (defline not ava 4.3

132160 Hs.295923 AA281770 seven in absentia (Drosophila) homolog 1 4.3

135062 Hs.93872 AA174183 ESTs 4.3

126510 Hs.334762 R49702 ESTs; Weakly similar to KIAA0319 [H.sapi 4.2

122055 Hs.98747 AA431732 EST 4.2

133136 Hs.6574 AF007165 suppressin (nuclear deformed epidermal a 4.2

109890 Hs.20843 H04649 ESTs 4.2

133294 Hs.69997 R79723 H.sapiens mRNA for translin associated z 4.2

134436 Hs.83190 S80437 fatty acid synthase {3' region} [human, 4.2

107375 Hs.251064 U88573 NBR2 4.2

122223 Hs.27413 AA436158 ESTs 4.2

103044 Hs.248210 X55777 H.sapiens Mahlavu hepatocellular carcino 4.2

120125 Hs.59815 W99362 EST 4.2

128969 Hs.283978 T65327 ESTs; Highly similar to (defline not ava 4.2

129637 Hs.1179 D90359 TATA box binding protein (TBP)-associate 4.2

106566 AA455921 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.2

112605 Hs.29852 R79220 ESTs 4.2

103364 Hs.279929 X90872 H.sapiens mRNA for gp25L2 protein 4.2

132811 Hs.57419 U25435 transcriptional repressor 4.2

126570 Hs.326292 T79274 ESTs 4.2

116298 Hs.94109 AA489046 ESTs 4.2

103024 Hs.105938 X53961 lactotransferrin 4.1

129133 Hs.108850 R56728 yg95c8.r1 Soares infant brain 1NIB Homo 4.1

133167 Hs.6641 N98707 kinesin family member 5C 4.1 

126871 Hs.14051 AA351779 ESTs 4.1 
132333 Hs.45032 AA192157 ESTs 4.1 

107376 Hs.327179 U90545 solute carrier family 17 (sodium phospha 4.1

128517 Hs.100861 AA280617 ESTs; Weakly similar to p50 katanin [H.s 4.1

130555 Hs.116774 AA450324 ESTs 4.1

105765 Hs.24183 AA343514 ESTs 4.1

126529 Hs.26369 AA133237 ESTs 4.1

125928 Hs.181889 H29730 ESTs 4.1

117280 Hs.172129 N22107 ESTs; Moderately similar to !!!! ALU SUB 4.1

100234 Hs.3085 D29677 KIAA0054 gene product 4.1

100959 Hs.116127 J00073 actin; alpha; cardiac muscle 4.1

107130 Hs.12913 AA620582 ESTs; Weakly similar to (defline not ava 4.1

105035 Hs.8859 AA128486 ESTs 4.1

126735 Hs.226795 AA808949 glutathione S-transferase pi 4.1

113056 Hs.8036 T26471 ESTs; Moderately similar to !!!! ALU SUB 4

102460 Hs.211582 U48959 Homo sapiens myosin light chain kinase ( 4

106968 Hs.26813 AA504631 ESTs; Weakly similar to (defline not ava 4

123107 Hs.104207 AA486071 ESTs 4

127258 Hs.267987 AA327550 ESTs; Weakly similar to !!!! ALU SUBFAMI 4

105329 Hs.22662 AA234561 ESTs 4

115504 Hs.42736 AA291946 ESTs 4

120726 Hs.97293 AA293656 ESTs 4

103576 Hs.94560 Z26317 desmoglein 2 4

127889 Hs.144941 AI147408 ESTs 4 

106394 Hs.25320 AA447223 ESTs 4 

128046 AA873285 ESTs 4 

103391 Hs.114368 X94453 pyrroline-5-carboxylate synthetase (glut 4

106448 Hs.27004 AA449455 ESTs 4

126513 Hs.86276 W27601 ESTs; Moderately similar to (defline not 4

129593 Hs.98314 AA487015 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.9

110151 Hs.31608 H18838 ESTs 3.9

105344 Hs.8645 AA235303 ESTs 3.9

104791 Hs.301871 AA029046 ESTs 3.9

123442 Hs.111493 AA598803 ESTs 3.9

127800 Hs.79428 AA521047 BCL2/adenovirus E1B 19kD-interacting pro 3.9

114555 Hs.167904 AA058594 ESTs 3.9 

122138 Hs.163960 AA435549 ESTs 3.9 

129565 Hs.198726 X77777 vasoactive intestinal peptide receptor 1 3.9

103471 Hs.75216 Y00815 protein tyrosine phosphatase; receptor t 3.9

133908 Hs.325474 M83216 caldesmon 1 3.9

105635 Hs.301985 AA281508 ESTs 3.9

134285 Hs.81086 AA460012 solute carrier family 22 (organic cation 3.9

134125 Hs.50421 R38102 KIAA0203 gene product 3.9

125628 Hs.241493 AA418069 natural killer-tumor recognition sequenc 3.9

103695 Hs.186600 AA018758 ESTs 3.9

100642 Hs.182183 HG2743-HT3926 Caldesmon 1, Alt. Splice 6, Non-Muscle 3.9

104334 Hs.78771 D82614 ESTs 3.9

110242 Hs.19978 H26417 ESTs 3.9

125298 Hs.289008 Z39255 ESTs 3.9

104060 Hs.303193 AA397968 zt87a9.r1 Soares_testis_NHT Homo sapiens 3.9

105823 Hs.293960 AA398197 ESTs 3.9

126499 Hs.110445 AA315671 ESTs; Moderately similar to unknown (Mjn 3.9

130752 Hs.18895 D50927 KIAA0137 gene product 3.8

123494 Hs.112110 AA599786 ESTs 3.8

104846 Hs.32478 AA040154 ESTs 3.8 

108921 Hs.71721 AA142913 ESTs 3.8

115506 Hs.45207 AA292537 ESTs 3.8

100452 Hs.241552 D87742 Human mRNA for KIAA0268 gene; partial cd 3.8

104454 Hs.129228 M84443 galactokinase 2 3.8

108730 Hs.102859 AA126254 ESTs 3.8

131223 Hs.24427 AA247788 ESTs; Highly similar to (defline not ava 3.8

104784 Hs.269228 AA027055 ESTs 3.8

104946 Hs.73848 AA069549 ESTs 3.8

106932 Hs.9394 AA495926 ESTs 3.8

101724 Hs.620 M69225 bullous pemphigoid antigen 1 (230/240kD) 3.8

106140 Hs.14912 AA424524 Homo sapiens mRNA for KIAA0288 gene; par 3.8

128135 Hs.269721 AA913491 ESTs 3.8

120030 Hs.28694 W92051 ESTs 3.8

126457 Hs.50382 AA007489 zh98g04.r1 Soares_fetal_liver_spleen_1NF 3.8

123917 Hs.112969 AA621311 EST 3.7

110714 Hs.17752 H95978 Homo sapiens phosphatidylserine-specific 3.7

130577 Hs.162 M35410 insulin-like growth factor binding prote 3.7

117667 Hs.44708 N39214 ser-Thr protein kinase related to the my 3.7

126104 Hs.39712 N77278 ESTs; Weakly similar to BONE/CARTILAGE P 3.7

100379 Hs£7B721 D82060 Homo sapiens mRNA for membrane protein w 3.7 

115646 Hs.305971 AA404352 ESTs 3.7 

125792 Hs.193700 AI005388 ESTs; Moderately similar to !!!! ALU SUB 3.7

102162 Hs.1592 U18291 CDC16 (cell division cycle 16; S. cerevi 3.7

128530 Hs.183475 AA504343 ESTs; Moderately similar to !!!! ALU SUB 3.7

119840 Hs.272531 W86779 EST 3.7 

110769 Hs.23837 N22222 yw34b08.s1 Morton Fetal Cochlea Homo sap 3.7 

132914 Hs.60293 AA496037 ESTs 3.7

113594 Hs.16683 T92030 ESTs 3.7

103702 Hs.279952 AA027793 ESTs; Highly similar to (defline not ava 3.7

130780 Hs.19347 AA248406 ESTs 3.7

123288 Hs.291025 AA495836 EST 3.7

120691 Hs.22380 AA291173 ESTs 3.7

103153 Hs.75295 X66534 guanylate cyclase 1; soluble; alpha 3 3.7 

129201 Hs.109390 H19989 ESTs 3.7 

114798 Hs.54900 AA159181 ESTs 3.7

126801 Hs.7337 AA512902 ESTs 3.7

105503 Hs.31707 AA256616 ESTs 3.7

104260 Hs.194283 AF008192 Homo sapiens putative GR6 protein (GR6) 3.7 

125980 Hs.35699 R97219 ESTs 3.7 

123255 Hs.105273 AA480890 ESTs 3.6 

103862 Hs.6363 AA206625 ESTs 3.6 

100696 Hs.121686 HG3162-HT3339 Transcription Factor IIa 3.6

134917 Hs.166994 X87241 FAT tumor suppressor (Drosophila) homolo 3.6

103520 Y10511 H.sapiens mRNA for CD176 protein 3.6

113778 Hs.302738 W15263 ESTs 3.6

101838 Hs.75511 M92934 connective tissue growth factor 3.6

113702 T97307 ESTs; Moderately similar to !!!! ALU SUB 3.6

118201 Hs.48428 N59800 EST 3.6

116519 Hs.68554 C20780 EST 3.6 

105886 Hs^2983 AA400517 ESTs; Moderately similar to UDP-GLUCOSE: 3.6 

106709 Hs.170291 AA464696 ESTs 3.6 

127858 Hs.27973 AA806365 oc26h07^1 NCI_CGAP_GCB1 Homo sapiens cD 3.6

101964 S81578 dioxin-responsive gene {putative polyade 3.6

105508 Hs.326416 AA256680 ESTs 3.6

116844 Hs.337434 H64938 ESTs 3.6

105372 Hs.142296 AA236481 ESTs 3.6

100745 Hs.144630 HG3510-HT3704 V-Erba Related Ear-3 Protein 3.6

127521 Hs.164018 AA809882 ESTs 3.6 

110758 Hs.274265 N21365 talin 3.6

107307 Hs.44155 T52099 creatine kinase; mitochondrial 2 (sarcom 3.6

133200 Hs.183639 AA432248 ESTs 3.6

114774 Hs.184325 AA150043 ESTs 3.6

120265 Hs.270696 AA173759 ESTs; Moderately similar to !!!! ALU SUB 3.6

134359 Hs.199067 M34309 v-erb-b2 avian erythroblastic leukemia v 3.6

116250 Hs.44829 AA480975 ESTs; Moderately similar to !!!! ALU SUB 3.6

106313 Hs.35841 AA436459 nuclear factor I/X (CCAAT-binding transc 3.6

131898 Hs.279780 N52232 ESTs 3.6

133444 Hs.73793 M27281 vascular endothelial growth factor 3.6 

128232 Hs.334641 H06296 ESTs 3.6 

135357 Hs.79572 AA235803 ESTs 3.5

457951 AI369384 arylsulfatase D 3.5

108407 AA075519 zm87h9.s1 Stratagene ovarian cancer (#93 3.5

126659 T16245 a disintegrin and metalloproteinase doma 3.5

104189 Hs.301804 AA485805 ESTs 3.5
125956 Hs.129014 N53276 ESTs 3.5

103026 Hs.79386 X54162 Human mRNA for a 64 Kd autoantigen expre 3.5

133011 Hs.171921 AA042990 sema domain; immunoglobulin domain (Ig); 3.5
131379 Hs.26176 R49035 ESTs 3.5

126742 Hs.169359 H64106 yr57e06j1 Soares fetal liver spleen 1NF 3.5

105560 Hs.306915 AA262783 ESTs 3.5
118472 Hs.42179 N66818 ESTs 3.5

105623 Hs.30127 AA280895 ESTs; Highly similar to !!!! ALU SUBFAMI 3.5
120262 Hs.145807 AA172076 ESTs; Moderately similar to !!!! ALU SUB 3.5
105027 Hs.28771 AA126472 ESTs 3.5
130760 Hs.16953 AA126997 phosphodiesterase 8A 3.5
117473 Hs.155560 N30157 ESTs 3.5

102683 Hs.168075 U70322 karyopherin (importin) beta 2 33

126349 Hs.13531 AA442858 ESTs; Weakly similar to (defline not ava 33

132154 Hs.41119 N67179 ESTs 33

131689 Hs.30696 AA599653 transcription factor-like 5 (basic helix 33

127862 Hs.163191 AA765305 EST 33

126995 Hs.189810 W26950 Human DNA sequence from PAC 388M5 on chr 33

119071 R31180 ESTs 33 

103941 Hs.96593 AA282878 ESTs 33 

110721 Hs.31319 H97678 ESTs 33 

126586 Hs.43086 AA011247 ESTs 33

103106 Hs.1857 X62025 phosphodiesterase 6G; cGMP-specific; rod 33

116357 Hs.90797 AA504806 Homo sapiens clone 23620 mRNA sequence 33

105309 Hs.4104 AA233790 ESTs 33

130796 Hs.19525 R39390 ESTs 33

109101 HS321B4 AA167708 ESTs 33

103134 Hs.2839 X65724 Norrie disease (pseudoglioma) 33

131798 Hs.301449 X86098 adenovirus 5 E1A binding protein 33 

118535 Hs.49418 N87968 ESTs 33 

102592 Hs.11223 U62389 Human putative cytosolic NADP-dependent 3.4 

125905 Hs.6456 T69B68 chaperonin containing TCP1; subunit 2 (b 3.4

109160 Hs.301997 AA179387 ESTs 3.4

105327 Hs.211593 AA234440 ESTs 3.4

106586 Hs.37787 AA456598 ESTs 3.4

122635 AA454085 EST 3.4

132413 Hs.260116 AA132969 metalloprotease 1 (pitrilysin family) 3.4

131938 Hs.34956 AA283620 ESTs 3.4

133871 Hs.182793 AA454597 ESTs 3.4

107175 Hs.292503 AA621751 ESTs; Weakly similar to KIAA0601 protein 3.4

101188 Hs.184298 L20320 cyclin-dependent kinase 7 (homolog of Xe 3.4

126422 Hs.237658 H48518 ESTs; Highly similar to apolipoprotein A 3.4

118475 N66845 ESTs; Weakly similar to !!!! ALU CLASS B 3.4

104558 Hs.88959 R56678 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.4

128307 Hs.132005 AI453794 ESTs 3.4

112254 Hs.25829 R51831 ESTs 3.4 

125408 Hs.89578 N72353 yv37e12^1 Soares fetal liver spleen 1NF 3.4

109834 Hs.175955 H00604 ESTs 3.4

130844 Hs.20191 D12122 seven in absentia (Drosophila) homolog 2 3.4

127143 Hs.20843 AA533553 nJ68h04^1 NCI_CGAP_Pr10 Homo sapiens cD 3.4

135309 Hs.42500 D25984 ESTs 3.4

125724 Hs.295978 AA083407 stimulated trans-acting factor (50 kDa) 3.4

127692 Hs.187983 AI021912 ESTs 3.4

116674 Hs.92127 F04816 ESTs 3.4

134700 Hs.8868 AA481414 golgi SNAP receptor complex member 1 3.4

114846 Hs.166196 AA234929 ESTs 3.4

103849 Hs.155983 Z70219 H.sapiens mRNA for 5'UTR for unknown pro 3.4

134835 Hs.89925 L04569 calcium channel; voltage-dependent; L ty 3.4

130568 Hs.16085 AA232535 ESTs; Highly similar to (defline not ava 3.4

111331 Hs.15978 N78773 ESTs 3.4

106035 Hs.10653 AA412505 ESTs 3.4

130987 Hs.21893 R45698 ESTs 3.4

112814 Hs.35828 R98192 ESTs 3.4

127815 Hs.255015 AA676009 ob93c10.s1 NCI_CGAP_GCB1 Homo sapiens cD 3.4

100144 Hs.75616 D13543 KIAA0018 gene product 3.4

101129 Hs.247992 L10405 Homo sapiens DNA binding protein for sur 3.4

130874 Hs^0621 T08287 ESTs 3.4

106882 Hs.26994 AA489009 ESTs 3.4 

103855 Hs.302267 AA195179 ESTs 3.4 

125957 H45213 yo03b08.r1 Soares adult brain N2b5HB55Y 3.3

114048 Hs.146085 W94613 ESTs 3.3

109826 Hs.75354 F13702 ESTs 3.3

125355 Hs.170098 R45630 ESTs; Highly similar to KIAA0372 [H.sapi 3.3

104182 Hs.143792 AA479990 ESTs; Weakly similar to glioma amplified 3.3

100294 Hs.75454 D49396 Human mRNA for Apo1_Human (MER5(Aop1-Mou 3.3

131688 Hs.30692 U24153 p21 (CDKN1A)-activated kinase 2 3.3

116256 Hs.88201 AA481256 ESTs; Weakly similar to (defline not ava 3.3

102034 Hs.230 U05291 fibromodulin 3.3

130072 Hs.14658 R99606 Human chromosome 5q13.1 clone 5G8 mRNA 3.3

114615 Hs.159456 AA083812 ESTs; Highly similar to (defline not ava 3.3

128707 Hs.104105 AA136474 Meis (mouse) homolog 2 3.3

115048 Hs.180057 AA252668 ESTs 3.3

125862 Hs.31110 H12084 ESTs 3.3 

135142 Hs.24192 R31679 ESTs 3.3 

103119 Hs.2877 X63629 cadherin 3; P-cadherin (placental) 3.3 

104460 Hs.62804 M91504 ESTs 3.3

100365 Hs.79284 D78611 mesoderm specific transcript (mouse) homo 3.3

131524 Hs.301804 N39152 ESTs 3.3

102165 Hs.159627 U18321 Death associated protein 3 3.3

126966 Hs.182575 R38438 solute carrier family 15 (H+/peptide tra 3.3

124839 Hs.140942 R55784 ESTs 3.3

100709 Hs.100469 HG3264-HT3441 Af-6 (Gb:U02478) 3.3

132967 Hs.61635 AA032221 Homo sapiens BAC clone RG041D11 from 7q2 3.3

102927 Hs.65114 X12876 keratin 18 3.3 

132616 Hs.283558 AA386264 ESTs 3.3 

125132 Hs.129781 W15495 ESTs 3.3

111225 Hs.31652 N68989 ESTs 3.3

114956 Hs.87113 AA243681 ESTs 3.3

122235 Hs.112227 AA438475 ESTs 3.3

112325 Hs.12315 R56055 ESTs 3.3

123360 Hs.178604 AA504784 ESTs 3.3

105150 Hs.155995 AA169640 Homo sapiens mRNA for KIAA0643 protein; 3.3

107391 Hs.284294 W02877 ESTs 3.3

113058 Hs.7569 T26893 EST 3.3

134371 Hs.82318 S69790 Brush-1 3.3

125669 Hs.333256 R51308 ESTs; Moderately similar to !!!! ALU SUB 3.3

111506 Hs.294105 R07726 ESTs 3.3

122974 Hs.194215 AA478625 ESTs 3.3

102369 Hs.299867 U39840 hepatocyte nuclear factor 3; alpha 3.3

120408 Hs.190151 AA235045 ESTs 3.3

117993 Hs.47402 N52039 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.3

129586 Hs.11500 AA437118 ESTs 3.3

128138 Hs.126494 AI200825 ESTs 3.3 

127265 AA332751 EST37214 Embryo, 8 week I Homo sapiens c 3.3 

107674 Hs.41143 AA011027 Homo sapiens mRNA for KIAA0581 protein; 3.2 

104866 Hs.293691 AA045342 ESTs 3.2

103427 Hs.250655 X97303 H.sapiens mRNA for Ptg-12 protein 3.2

132990 Hs.334334 AA458761 ESTs 3.2

127017 Hs.251946 AA740146 ESTs 3.2

132313 Hs.44481 U13220 forkhead (Drosophila)-like 6 3.2

106880 Hs.32425 AA488889 ESTs 3.2

107039 Hs.169780 AA599751 homologous to yeast nitrogen permease (c 3.2

120870 Hs.292581 AA357172 ESTs 3.2

107920 Hs2B4207 AA027951 ESTs 3.2

104165 Hs.105116 AA459160 EST 3.2

107012 Hs.63908 AA598745 ESTs 3.2

103605 Hs.194657 Z35402 H.sapiens gene encoding E-cadherin, exon 3.2

124006 Hs.270016 D60302 ESTs 3.2

101300 Hs.74137 L40391 Homo sapiens (clone s153) mRNA fragment 3.2

101183 Hs.795 L19779 H2A histone family; member O 3.2

125596 R25698 yg44h11.r2 Soares infant brain 1NIB Homo 3.2

127261 AA661567 nu86b02.s1 NCLCGAPJW1 Homo sapiens cD 3.2

120090 Hs.59554 W94591 ESTs 3.2

129393 Hs.166982 D13435 phosphatidylinositol glycan; class F 3.2

120923 Hs.97129 AA382283 ESTs 3.2

118907 Hs.274256 N91003 ESTs 3.2

111552 Hs.191185 R09411 ESTs 3.2

104431 Hs.99913 J03019 adrenergic; beta-1-; receptor 3.2

133551 Hs.278634 D63480 Human mRNA for KIAA0146 gene; partial cd 3.2

131615 Hs.192803 D14533 xeroderma pigmentosum; complementation g 3.2

126547 Hs.84072 U47732 transmembrane 4 superfamily member 3 3.2

103172 Hs.116774 X68742 integrin; alpha 1 3.2

113867 Hs.24095 W68845 ESTs 3.2

133323 Hs.70937 Z83735 H3 histone family; member K 3.2

111597 Hs.189716 R11499 ESTs 3.2

121515 Hs.104698 AA412133 ESTs 3.2

107445 Hs.6639 W28406 ESTs 3.2

106887 Hs.334335 AA489091 ESTs 3.2

123052 Hs.185766 AA481806 ESTs 3.2

107072 Hs.130760 AA609113 Homo sapiens mRNA; cDNA DKFZp586N0318 (f 3.2

102214 Hs.32964 U23752 SRY (sex-determining region Y)-box 11 3.2

123147 AA487961 ab11h6.s1 Stratagene lung (#93721) Homo 3.2 

125435 Hs.272138 R00940 ye87g03.r1 Soares fetal liver spleen 1NF 3.2

116246 Hs.250646 AA479951 ESTs; Highly similar to ubiquitin-conjug 3.2

105169 Hs.180769 AA180321 Homo sapiens (clone S164) mRNA; 3' end o 3.2

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 3.2

124866 Hs.304389 R68571 ESTs 3.2

133205 Hs.67619 AA089559 Homo sapiens mRNA; chromosome 1 specific 3.2

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 35

101232 Hs.242894 L28997 ADP-ribosylation factor-like 1 3.1

132906 Hs.234896 AA142857 ESTs; Highly similar to geminin [H.sapie 3.1

104281 Hs.5669 C14290 ESTs 3.1 

123926 Hs.227933 AA621348 ESTs; Highly similar to (defline not ava 3.1 

134464 Hs.239720 N79354 ESTs; Weakly similar to Rga [D.melanogas 3.1 

105322 Hs.16348 AA234100 ESTs 3.1

100631 Hs.46332 HG2709-HT2805 Serine/Threonine Kinase (Gb:Z25431) 3.1

130791 Hs.199263 AA259102 ESTs; Highly similar to (defline not ava 3.1

131220 Hs.300855 R77200 ESTs 3.1

113237 Hs.123642 T62857 ESTs 3.1

125562 Hs.m68 AI494372 ESTs 3.1

134110 Hs.79136 U41060 Human breast cancer; estrogen regulated 3.1

132393 Hs.47334 W85888 ESTs; Moderately similar to !!!! ALU SUB 3.1

107439 Hs.296842 W27995 ESTs; Moderately similar to non-muscle m 3.1

125863 Hs.40719 AA299096 Homo sapiens mRNA; cDNA DKFZp564M0916 (f 3.1

105811 Hs.286192 AA394121 ESTs 3.1

129284 Hs.296141 AA104023 ESTs 3.1 

125321 Hs.178294 T86652 ESTs 3.1 

107332 Hs.183297 T87750 ESTs 3.1 

123570 Hs.109653 AA608955 ESTs 3.1 

100384 Hs.90800 D83646 matrix metalloproteinase 16 (membrane-in 3.1

109063 Hs.38972 AA161043 tetraspan 1 3.1

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1

131839 Hs.33010 H80622 Homo sapiens mRNA for KIAA0633 protein; 3.1

117606 Hs.44698 N35115 ESTs 3.1

418998 Hs.287849 F13215 ESTs 3.1

125180 Hs.103120 W58344 ESTs 3.1

100789 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1

126017 Hs.159440 H60487 ESTs 3.1 

132452 Hs.247324 AA005262 Homo sapiens DNA sequence from PAC 262D1 3.1 

129077 Hs.108479 H78886 ESTs 3.1

126563 Hs.181368 W26247 U5 snRNP-specific protein (220 kD); orth 3.1 

129650 Hs.116258 N52554 ESTs 3.1 

123465 AA599033 ESTs 3.1 

126486 Hs.152316 AA345339 EST51345 Gall bladder II Homo sapiens cD 3.1

126460 Hs.167031 W01616 za36d05.r1 Soares fetal liver spleen 1NF 3.1

118697 Hs.43234 N72094 ESTs 3.1 

103860 Hs.38057 AA203742 ESTs 3.1 

127968 Hs.124347 AA971439 ESTs 3.1 

124984 Hs.223241 T47566 yb15c11.s1 Stratagene placenta (#937225) 3.1

103903 Hs.15220 AA249334 j312.seq.F Human fetal heart, Lambda ZAP 3.1

106697 Hs.22242 AA463737 ESTs 3.1 

130892 Hs.20993 AA442604 ESTs; Weakly similar to Ydr374cp [S.cere 3 

114032 Hs.35014 W92779 ESTs 3

128835 Hs.106390 W15528 ESTs 3

103667 Hs.247815 Z80788 H.sapiens H4/1 gene 3

126264 Hs.250614 N42897 yy13h06.r1 Soares melanocyte 2NbHM Homo 3

132626 Hs.21275 D25755 ESTs 3 

131107 Hs.75354 N87590 ESTs 3 

126780 Hs.5811 R12421 ESTs 3 

127363 Hs.22116 AA307744 Homo sapiens Cdc14B1 phosphatase mRNA; c 3

103690 Hs.84063 AA016186 ESTs 3 

102589 Hs.8867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3 

125144 Hs.24336 W37999 ESTs 3 

132977 Hs.301404 U28686 RNA binding motif protein 3 3 

65 120714 Hs.146170 AA292689 ESTs 3 

101038 Hs.79411 J05249 replication protein A2 (32kD) 3 

102856 Hs.248177 X00090 Human histone H3 gene 3

105516 Hs.30738 AA257971 ESTs 3

131137 Hs.33287 U85193 nuclear factor I/B 3

127221 Hs.241551 AI354332 ESTs 3

411888 Hs.24104 R26708 ESTs 3 

131684 Hs.3066 U26174 granzyme K (serine protease; granzyme 3; 3 

100629 Hs.21291 HG2706-HT2802 Serine/Threonine Kinase (Gb:Z25428) 3

119944 Hs.58915 W86838 EST 3

113801 Hs.118281 W38418 zinc finger protein 266 3

133780 Hs.76152 M14219 decorin 3 

104690 Hs.14449 AA010889 ESTs 3 

126371 Hs.304139 N57645 EST 3 

127635 Hs.116346 AA766903 ESTs 3

128434 Hs.143880 AI190914 ESTs 3 

435761 Hs.187555 AA701941 ESTs 3 

125025 Hs.50748 T71561 ESTs 3 

124940 Hs.103804 R99599 heterogeneous nuclear ribonucleoprotein 3

128742 Hs.251531 D00763 proteasome (prosome; macropain) subunit; 3

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat reg 3 

112068 HSJ22545 R43910 ESTs 3 

1(6346 Hs.263727 AA235465 ESTs; Moderately similar to !!!! ALU SUB 3 

130972 Hs.21739 AA370302 Homo sapiens mRNA; cDNA DKFZp586I1518 (f 3

131230 Hs.274407 AA149987 thymus specific serine peptidase 3

133743 Hs.75847 N79435 ESTs 3

127402 Hs.227949 AA358869 ESTs; Highly similar to SEC13-RELATED PR 3

117483 Hs.44189 N30426 ESTs 3 

123659 Hs.112699 AA609368 ESTs 3 

103963 Hs.63290 AA298588 EST114219 HSC172 cells II Homo sapiens c 3

103795 Hs.7367 AA112222 ESTs; Moderately similar to (defline not 3

115092 Hs.80975 AA255903 CD39-like 4 2.9

134831 Hs.89890 S72370 pyruvate carboxylase 2.9 

128579 Hs.101810 AA093378 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9

134193 Hs.7980 F09570 ESTs 2.9

123522 Hs.112575 AA608577 ESTs 2.9

107109 Hs.32793 AA609943 ESTs 2.9

134694 Hs.88556 D50405 histone deacetylase 1 2.9

134399 Hs.82689 H99801 tumor rejection antigen (gp96) 1 2.9 

35 134632 Hs.174139 AA398710 H. sapiens RNA for CLCN3 2.9 

106683 Hs.14512 AA461495 ESTs 2.9 

108555 AA084963 zn13e12.s1 Stratagene hNT neuron (#93723 2.9

100953 Hs.2110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L12693) 2.9

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 2.9

101813 Hs.139226 M87338 replication factor C (activator 1) 2 (40 2.9

106636 Hs.266 AA459950 ESTs 2.9

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 2.9

125819 Hs.251871 AA044840 stromal cell-derived factor 1 2.9 

106282 Hs.9857 AA433946 ESTs; Weakly similar to (defline not ava 2.9 

45 100386 Hs.301636 D83703 peroxisomal biogenesis factor 6 2.9 

114546 Hs.98074 AA056263 ESTs; Moderately similar to !!!! ALU SUB 2.9

105914 Hs.9701 AA402224 Homo sapiens growth arrest and DNA-damag 2.9

108552 AA084912 zn11c7.s1 Stratagene hNT neuron (#937233 2.9

126505 Hs.190057 W26894 16a11 Human retina cDNA randomly primed 2.9

50 134098 Hs.79086 X06323 Human MRL3 mRNA for ribosomal protein L3 2.9 

129721 Hs.211539 L19161 eukaryotic translation initiation factor 2.9 

100076 Hs.277422 AB000897 Homo sapiens mRNA for cadherin FIB3, par 2.9

117466 Hs.44104 N29862 ESTs 2.9

106335 Hs.36688 AA437258 ESTs; Moderately similar to WAP four-dis 2.9

134510 Hs.250870 U25265 protein kinase; mitogen-activated; kinas 2.9

105835 Hs.32995 AA398412 ESTs 2.9

106611 Hs.26267 AA458904 ESTs; Weakly similar to torsinA [H.sapie 2.9

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9

100641 Hs.182183 HG2743-HT2848 Caldesmon 1, Alt Splice 4, Non-Muscle 2.9

60 104602 R86920 ESTs 2.9 

117203 Hs.42738 H99799 ESTs 2.9 

131889 Hs.34073 AA401912 BH-protocadherin (brain-heart) 2.9 

101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 2.9

115271 Hs.5724 AA279422 ESTs 2.9 

65 125812 Hs.287912 H73420 lectin; mannose-binding; 1 2.9 

110740 Hs.19762 H99675 ESTs 2.9 

103406 Hs.235728 X95677 H.sapiens mRNA for ArgBPIB protein 2.9

104577 Hs.132390 R71539 ESTs 2.9

102772 Hs.161002 U83115 absent in melanoma 1 2.9

131710 Hs.30985 AA233225 ESTs; Highly similar to (defline not ava 2.9

125231 Hs.268903 W84714 ESTs 2.9 

127380 Hs.15535 AI417137 Homo sapiens clone 24582 mRNA sequence 2.9 

104229 Hs.61289 AB002346 inositol phosphate 5'-phosphatase 2 (syn 2.9

126600 Hs.191385 AA699949 ESTs 2.9

125175 Hs.303030 W52355 EST 2.9

103849 Hs.34578 AA187045 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9

102126 Hs.78961 U14575 protein phosphatase 1; regulatory (inhib 2.9 

124906 Hs.107815 R87647 ESTs 2.9 

131148 Hs.303125 C00038 ESTs 2.9

123158 Hs.218329 AA488658 heat shock 70KD protein 1 2.9 

133667 Hs.75462 U72649 Human BTG2 (BTG2) mRNA; complete cds 2.9 

105182 Hs.18271 AA191014 ESTs; Weakly similar to Ydr372cp [S.cere 2.9

133968 Hs.232068 D15050 Human mRNA for transcription factor AREB 2.9

15 117425 Hs536901 N27154 ESTs 2.9 

111087 Hs.37637 N59645 ESTs 2.9 

129641 Hs.11805 N66066 ESTs 2.9 

128639 Hs.102897 N91246 ESTs 2.9 

133209 Hs.79265 AA114183 ESTs; Moderately similar to glutamate py 2.9

135154 Hs.267812 AA126433 sorting nexin 4 2.9

126838 Hs.279609 AA858097 pigment epithelium-derived factor 2.9

103803 Hs.106149 AA127696 ESTs 2.9

102139 Hs.2128 U15932 dual specificity phosphatase 5 2.9 

128104 AA971000 op67g11.s1 Soares_NFL_T_GBC_S1 Homo sapi 2.8

127834 Hs.337631 AA761415 nz22d08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.8

133101 Hs.180852 AA488230 ESTs 2.8

127250 Hs.217916 AI023717 ESTs 2.8

135063 Hs.93883 D10537 myelin protein zero (Charcot-Marie-Tooth 2.8

126323 Hs.68644 N45014 yy80g06.r1 Soares_multiple_sclerosis_2Nb 2.8

30 121873 Hs.145696 AA426270 ESTs 2.8 

122090 Hs.88684 AA432141 ESTs 2.8 

118728 Hs.322645 N73705 ESTs 2.8

135400 Hs.99915 M23263 androgen receptor (dihydrotestosterone r 2.8 

125278 Hs.129998 W93523 ESTs 2.8 

35 124387 Hs.109019 N27637 ESTs 2.8 

124803 Hs.12185 R45480 cyclin K 2.8

H45968 Hs.32149 H45968 ESTs 2.8

104261 Hs5409 AF008442 RNA polymerase I subunit 2.8 

105366 Hs.282093 AA236356 ESTs 2.8 

40 106070 Hs.5957 AA417761 Homo sapiens clone 24416 mRNA sequence 2.8 

131356 Hs.25860 M13241 v-myc avian myelocytomatosis viral relat 2.8

112009 Hs.26255 R42714 EST 2.8

133199 Hs.250175 AA609773 Homo sapiens clone 23904 mRNA sequence 2.8

110379 Hs.33130 H44825 ESTs 2.8

45 103890 Hs.72085 AA236843 ESTs; Weakly similar to unknown [S.cerev 2.8 

128152 R20353 yg20f10.r1 Soares infant brain 1NIB Homo 2.8

107008 Hs.23740 AA598710 ESTs 2.8

135243 HS57101 AA215333 ESTs 2.8

103058 Hs.184510 X57348 stratifin 2.8

132020 Hs.293845 AA428990 ESTs 2.8

116354 Hs.292566 AA504262 ESTs 2.8 

125867 Hs.12372 H98141 ESTs 2.8 

120603 Hs.98541 AA282787 ESTs; Highly similar to (defline not ava 2.8

115119 Hs.46847 AA256524 Human DNA sequence from clone 30M3 on ch 2.8

133865 Hs.170290 F09315 discs; large (Drosophila) homolog 5 2.8

109415 Hs.110826 AA227219 Homo sapiens CAGF9 mRNA; partial cds 2.8

128687 Hs.23767 Z38910 ESTs 2.8 

109984 Hs.10299 H09594 ESTs; Moderately similar to !!!! ALU SUB 2.8

133179 Hs.66731 U81599 homeobox B13 2.8

115998 Hs.338629 AA448488 ESTs; Weakly similar to zinc finger prot 2.8

112180 Hs.25067 R49116 EST 2.8

120428 Hs.173694 AA236822 ESTs; Moderately similar to (defline not 2.8

106241 Hs.6019 AA430108 ESTs 2.8 

131060 Hs.22564 AA160890 myosin VI 2.8

111383 Hs.40919 N94527 ESTs 2.8

102123 Hs.1594 U14518 centromere protein A (17kD) 2.8

102722 Hs.79981 U79242 Human clone 23560 mRNA sequence 2.8

129887 Hs.274324 W92041 PCAF associated factor 65 alpha 2.8

126663 Hs.181297 AA714635 ESTs 2.8
104367 Hs.134342 H17438 ESTs; Weakly similar to seventransmembra 2.8
107316 Hs.193700 T63174 ESTs; Moderately similar to !!!! ALU SUB 2.8
126050 Hs.145096 AA972446 ESTs 2.8
124447 N46000 ESTs 2.8
111398 Hs.125585 R00086 deafness; X-linked 1; progressive 2.8
134085 Hs.79018 U20979 chromatin assembly factor I (180 kDa) 2.8
124788 Hs.100912 R43543 ESTs 2.8
112248 Hs.326416 R51361 ESTs 2.8
121309 Hs.97312 AA402482 ESTs 2.8
103076 Hs.75319 X59618 ribonucleotide reductase M2 polypeptide 2.8
107071 Hs.35198 AA609053 ESTs 2.8
104425 Hs.35380 H88498 ESTs 2.8
132991 Hs.62245 AA446906 solute carrier family 25 (mitochondrial 2.8
104968 Hs.29669 AA084602 ESTs 2.8
121153 Hs.97694 AA399640 ESTs 2.8
131216 Hs.243901 D31058 ESTs 2.8
109682 Hs.22859 F09299 ESTs 2.8
131990 Hs.168818 H77734 ESTs; Moderately similar to roundabout 1 2.8
132027 Hs.181444 N78844 ESTs; Weakly similar to R12C12.6 [C.eleg 2.8
127383 Hs.190478 AA447990 ESTs 2.8
132598 Hs.530 M81379 collagen; type IV; alpha 3 (Goodpasture 2.8
101121 Hs.1313 L09753 tumor necrosis factor (ligand) superfami 2.8
123000 Hs.105840 AA479347 ESTs 2.8
121329 Hs.1755 AA404324 ESTs 2.8
100481 Hs.121489 HG1098-HT1098 Cystatin D 2.7
113803 Hs.283683 W42789 ESTs 2.7
110934 Hs.169001 N48708 ESTs; Weakly similar to cytochrome P450 2.7
432888 T86823 ESTs 2.7
121802 Hs.188898 AA424328 ESTs 2.7
130396 Hs.155313 AB002331 Human mRNA for KIAA0333 gene; partial cd 2.7
121103 Hs.97697 AA398936 ESTs; Weakly similar to (defline not ava 2.7
131129 Hs.23240 R27296 ESTs 2.7
130943 Hs.272429 D50855 calcium-sensing receptor (hypocalciuric 2.7
134676 Hs.87819 W28051 ESTs; Weakly similar to keratin 9; cytos 2.7
111900 Hs.25318 R39044 ESTs 2.7
106025 Hs.173334 AA412063 ESTs 2.7
126144 Hs.40639 N39696 yx92a07j1 Soares melanocyte 2NbHM Homo 2.7
103248 Hs.75262 X77383 cathepsin O 2.7
127230 Hs.274170 H30501 Homo sapiens Opa-interacting protein OIP 2.7
101584 Hs.84072 M35252 transmembrane 4 superfamily member 3 2.7
124131 Hs.167489 H19980 ESTs 2.7
129689 Hs.77873 AA130156 ESTs 2.7
132892 Hs.9973 W92797 ESTs 2.7
120827 Hs.132967 AA347717 ESTs 2.7
134579 Hs.85963 N23222 ESTs; Moderately similar to !!!! ALU SUB 2.7
106149 HS356301 AA424881 ESTs 2.7
132037 Hs.332541 AA203649 ESTs; Weakly similar to HEM45 [H.sapiens 2.7
130542 Hs.179825 U64675 Human sperm membrane protein BS-63 mRNA, 2.7
122851 Hs.99598 AA463627 ESTs 2.7
134983 Hs.196384 D28235 prostaglandin-endoperoxide synthase 2 (p 2.7
120537 Hs.160422 AA262790 ESTs 2.7
131036 Hs.174140 X64330 ATP citrate lyase 2.7
133889 Hs.211582 AA099391 ESTs 2.7
128847 Hs.106529 AA424199 zv81e01.r1 Soares_total_fetus_Nb2HF8_9w 2.7
112755 Hs.306044 R93802 ESTs 2.7
423239 AA323591 EST26392 Cerebellum II Homo sapiens cDNA 2.7
105031 Hs.12321 AA127240 ESTs 2.7
126021 Hs.187516 AA775894 ESTs 2.7
102116 U13706 Human ELAV-like neuronal protein 1 isofo 2.7
133394 Hs.237225 R16759 ESTs; Weakly similar to (defline not ava 2.7
104267 Hs.278439 C00358 ESTs 2.7
107614 Hs.40241 AA004878 ESTs; Highly similar to (defline not ava 2.7
129809 Hs.1259 X55283 asialoglycoprotein receptor 2 2.7
112109 Hs.283309 R45221 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.7
128422 T85681 yd60c06 j1 Soares fetal liver spleen 1NF 2.7
109494 Hs.43899 AA233702 ESTs 2.7
118696 Hs.292284 N72086 Homo sapiens RNA polymerase III largest 2.7
106053 Hs.36727 AA416963 ESTs; Highly similar to histone H2A [H.s 2.7
104440 Hs.264380 L20492 gamma-glutamyltransferase 1 2.7
129426 Hs.111323 AA412087 EST; Highly similar to (defline not avai 2.7

123798 AA620411 small inducible cytokine A5 (RANTES) 2.7 

106716 Hs.238928 AA464962 ESTs 2.7 

103663 278291 Z78291 Homo sapiens brain fetus Homo sap 2.7 

114182 Hs.22265 Z38909 ESTs 2.7

113063 Hs.5027 T32438 ESTs 2.7

127897 AA773857 af80c09.M Soares_NhHMPu_S1 Homo sapiens 2.7

130621 Hs.16303 AA621718 ESTs; Weakly similar to (defline not ava 2.7

116245 Hs.42796 AA479958 ESTs; Highly similar to (defline not ava 2.7

125499 R11878 yf49d11.r1 Soares infant brain 1NIB Homo 2.7

133960 Hs.77899 M19267 tropomyosin 1 (alpha) 2.7 

104470 Hs.246358 N28843 ESTs; Weakly similar to Similar to coHa 2.7 

134982 Hs.92308 N48086 ESTs 2.7 

106803 Hs.284295 AA479114 ESTs 2.7

104899 Hs.285574 AA054726 ESTs 2.7

125401 Hs.337585 AI204637 ESTs; Moderately similar to KIAA0350 [H. 2.7

111253 Hs.15768 N70042 ESTs; Moderately similar to !!!! ALU SUB 2.7

118449 Hs.164478 N66413 ESTs; Weakly similar to (defline not ava 2.7

134507 Hs.84318 M63488 replication protein A1 (70kD) 2.7

20 121609 Hs.98185 AA416867 EST 2.7 

113835 Hs.27475 W56590 ESTs 2.7 

113962 Hs.285290 W86375 ESTs; Highly similar to (defline not ava 2.7 

121913 Hs.98558 AA428062 ESTs 2.7

108194 Hs.216717 AA057250 ESTs 2.7 

25 130799 Hs.12696 AA464273 ESTs 2.7 

123184 Hs.18166 AA489072 Homo sapiens mRNA for KIAA0870 protein; 2.7 

103420 Hs.173497 X97065 SEC23-like protein B 2.7

106186 Hs.6315 AA427398 acetylserotonin N-methyltransferase-like 2.7

101349 L77559 Homo sapiens DGS-B partial mRNA 2.7

112954 Hs.6655 T16559 ESTs 2.7

133054 Hs.291079 R07876 ESTs; Weakly similar to unknown [S.cerev 2.7

128131 Hs.25640 AI283162 claudin 3 2.6

101864 Hs.75777 M95787 transgelin 2.6 

111948 Hs.26303 R40752 ESTs 2.6 

35 130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6 

126507 Hs.23964 AI362218 ESTs 2.6 

117903 Hs.47111 N50740 ESTs 2.6 

116345 Hs.199067 AA495981 ESTs 2.6

132227 Hs.4248 AA412620 ESTs 2.6

125746 Hs.274256 H03574 yj42b06.r1 Soares placenta Nb2HP Homo sa 2.6

105073 Hs.89463 AA137034 ESTs 2.6 

102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6 

131367 Hs.173933 AA456687 ESTs 2.6 

130792 Hs.19500 AA307896 nuclear localization signal deleted in v 2.6 

45 107427 Hs.46736 W26975 ESTs 2.6 

117477 Hs.44175 N30328 ESTs 2.6 

106290 Hs.16364 AA435542 ESTs 2.6 

126829 Hs.7910 R11547 ESTs 2.6 

118836 Hs.173001 N79820 ESTs 2.6 

100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6

104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6

135051 Hs.83484 C15324 ESTs 2.6

126081 Hs.227835 AI346024 collagen; type I; alpha 1 2.6

123579 AA608983 af5d4.s1 Soares_testis_NHT Homo sapiens 2.6 

55 130115 Hs.149923 M31627 X-box binding protein 1 2.6 

101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6 

122962 Hs.104720 AA478429 ESTs; Moderately similar to !!!! ALU SUB 2.6

126151 Hs.40808 AA324743 ESTs 2.6

128925 Hs.21851 D61676 Homo sapiens mRNA; cDNA DKFZp586J2118 (f 2.6

128919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6

130296 Hs.154103 R09286 LIM protein (similar to rat protein kina 2.6

128402 Hs.191637 AA457244 ESTs 2.6 

129273 Hs.109968 W63783 ESTs 2.6 

125483 Hs.7788 F07759 ESTs 2.6 

65 132953 Hs.321264 AA029927 ESTs 2.6 

130963 HsJ21639 U57099 nuclear protein; marker for differential 2.6 

120614 Hs.194154 AA284281 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.6

123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabin3 [R.no 2.6

121710 Hs.66744 AA418011 ESTs 2.6

125428 Hs.851 W74608 ESTs; Highly similar to (defline not ava 2.6

115906 Hs.82302 AA436616 ESTs 2.6 

103432 AA076626 Homo sapiens clone 23851 mRNA sequence 2.6 

126191 Hs.191911 H07728 ESTs 2.6 

5 106164 Hs.281434 AA425773 ESTs 2.6 

111519 Hs.268615 R08165 ESTs 2.6 

134590 Hs.173840 W58612 ESTs 2.6 

102565 U59748 Human desert hedgehog (hDHH) mRNA, parti 2.6 

129879 Hs.13109 AA194973 ESTs 2.6 

10 114264 Hs.334609 Z40074 ESTs 2.6 

106236 Hs.21104 AA429951 ESTs 2.6 

135192 Hs.321709 AF000234 purinergic receptor P2X; ligand-gated b 2.6

109833 Hs.29889 H00580 ESTs 2.6

105756 Hs.8535 AA303088 ESTs; Weakly similar to transformation-r 2.6

121422 Hs.97967 AA406210 ESTs 2.6

130417 Hs.155485 U56522 Human huntingtin interacting protein (HI 2.6

124312 Hs.102329 H94647 ESTs 2.6 

108998 Hs.97199 AA156058 ESTs 2.6 

127081 Hs.180591 R88362 ESTs; Weakly similar to weak similarity 2.6 

129574 Hs.11463 AA458603 ESTs; Weakly similar to (defline not ava 2.6

112410 Hs.26904 R61680 ESTs 2.6

123929 Hs.112981 AA621364 ESTs 2.6

122905 Hs.104835 AA470070 ESTs 2.6

116399 Hs.110637 AA599729 Homo sapiens homeobox protein A10 (HOXA1 2.6

130279 Hs.153934 AA424044 core-binding factor; runt domain; alpha 2.6

130021 Hs.1435 M24470 guanosine monophosphate reductase 2.6

100585 Hs.199160 HG2367-HT2463 Trithorax Homolog Hrx 2.6

104965 Hs.30177 AA084104 ESTs 2.6 

117711 Hs.46485 N45201 EST 2.6 

30 124792 Hs.48712 R44357 ESTs 2.6 

111299 Hs.74313 N73808 ESTs 2.6 

103616 Hs.32971 Z46973 phosphoinositide-3-kinase; class 3 2.6

133629 Hs.195614 D13642 KIAA0017 gene product 2.6

126484 Hs.169977 AI086782 ESTs 2.6

100858 HG4245-HT4515 Forkhead Family Afx1 2.6

133547 Hs.301927 X02883 T-cell receptor; alpha 0W;C) 2.6

126680 Hs.133865 F07097 ESTs 2.6

125739 Hs.92137 AA428557 v-myc avian myelocytomatosis viral oncog 2.6

102276 Hs.10247 U30999 Human (memc) mRNA, 3'UTR 2.6

105586 Hs.191538 AA279137 ESTs 2.6

103978 Hs.34136 AA307443 ESTs 2.6

125054 Hs.268601 T80622 ESTs; Weakly similar to (defline not ava 2.6

114212 Hs.21201 Z39338 ESTs; Highly similar to (defline not ava 2.6 

116959 Hs.40022 H79310 EST 2.6 

45 109228 Hs.306995 AA193366 ESTs 2.6 

133989 Hs.78202 U29175 SWI/SNF related; matrix associated; acti 2.6

100640 Hs.182183 HG2743-HT2845 Caldesmon 1, Alt Splice 3, Non-Muscle 2.6

133093 Hs.285996 AA598749 ESTs 2.6

114306 Hs.6540 Z40861 ESTs 2.6

50 106060 Hs.171391 AA417287 C-terminal binding protein 2 2.5 

107748 Hs.60772 AA017258 EST 2.5 

100134 Hs.49 D13264 macrophage scavenger receptor 1 2.5 

133969 Hs.78 U13044 GA-binding protein transcription factor; 2.5

130992 Hs.74316 AA455001 ESTs 2.5

127493 Hs.291701 AA808081 oc39a08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.5

132869 Hs.203961 N26855 ESTs 2.5

117570 Hs.44583 N34415 EST 2.5 

124644 Hs.109654 N91279 ESTs 2.5 

103558 Hs.2785 Z19574 keratin 17 2.5

132883 Hs.5897 AA047151 ESTs 2.5

102009 Hs.82643 U02680 protein tyrosine kinase 9 2.5

116058 Hs.20159 AA454156 ESTs 2.5

121989 Hs.193784 AA430044 ESTs 2.5

131257 Hs.24908 AA256042 ESTs 2.5

100320 Hs.75275 D50916 homolog of yeast (S. cerevisiae) ufd2 2.5

102959 Hs.121524 X15722 glutathione reductase 2.5

132969 Hs.6166 AA047616 ESTs 2.5

130869 Hs.2057 AA128100 uridine monophosphate synthetase (orotat 2.5
129645 Hs.118131 L38928 5;10-methenyltetrahydrofolate synthetase 2.5
126399 Hs.83883 AA128075 zl16d08j1 Soares_pregnant_uterus_NbHPU 2.5
134069 Hs.78935 U29607 Homo sapiens eIF-2-associated p67 homolo 2.5
109816 Hs.61960 F11013 ESTs; Weakly similar to KIAA0176 [H.sapi 2.5
134801 Hs.89695 X02160 insulin receptor 2.5
104232 Hs.10587 AB002351 Human mRNA for KIAA0353 gene; partial cd 2.5
107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 2.5
106057 Hs.289074 AA417067 ESTs 2.5
134252 Hs.80720 AA031782 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 2.5
128062 Hs.105547 AA379500 ESTs 2.5
110009 Hs.6614 H10933 ESTs 2.5
111375 Hs.20432 N93696 ESTs 2.5
122642 Hs.99361 AA454186 ESTs 2.5
127999 Hs.69851 AA837495 ESTs; Weakly similar to Wiskott-Aldrich 2.5
105029 Hs.13268 AA126855 ESTs 2.5
105082 Hs.26765 AA143763 ESTs; Weakly similar to Similarity to S. 2.5

TABLE 1A shows the accession numbers for those primekeys lacking unigeneID's for Table
1. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
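
As an illustration of the Pkey / CAT number / Accession layout described above, the following Python sketch splits one table row into its three parts. It assumes each row is whitespace-delimited, that the CAT number is a single token (for example 32479_1), and that every remaining token is a Genbank accession. The names ClusterRow and parse_cluster_row are hypothetical and are not part of the original disclosure.

```python
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    pkey: str              # unique Eos probeset identifier
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accessions in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected Pkey, CAT number and at least one accession: {line!r}"
        )
    return ClusterRow(pkey=fields[0], cat_number=fields[1], accessions=fields[2:])


# Sample row (assumed layout): probeset 102565 and its cluster members.
row = parse_cluster_row("102565 32479_1 AB010994 U59748 AA064680")
print(row.pkey, row.cat_number, len(row.accessions))
```

Rows whose accession list wraps onto continuation lines would need to be joined before being passed to a parser of this kind.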



Pkey CAT number Accessions 



108552 111555_1 AA071210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370

126023 1596090_1 H57661 H56881

126086 1606216_1 H75681 H70975

102565 32479_1 AB010994 U59748 AA064680

101964 48158_-7 S81578

125499 1562851_1 H10543 R11878

125596 1708455_1 R25698 R56582 R56018

118417 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833
    AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574
    N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262
    AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI807418 AW818140 AA502500 AI206199 AI671282
    AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661
    AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817
    BE466611 AI206344 AA574397 AA348354 AI493192

125661 327827_1 AA491830 R50173 R55192 R50320 AI732306 AI732305 AI820727 AI820728 R55191 R50318 R50227

125957 1583542_1 H41694 H45213

125982 1766315_1 R98091 W92898

127248 227560_1 AA364195 AA325029 AW862050

103731 112052_1 AA070545 AA131490 AA131373

127261 231687_1 AA330501 AA661567

127265 232391_1 AA331503 AA332751 AW962542

126659 1541209_1 T16245 R19694 F13545 H10299 T66048 T65279 H18006

127315 37938_1 AF116622 AI114507 AA640834 AA377999

103806 112618_1 AA130614 AA071410

128104 502608_1 AA906093 AA971000

104602 524482_1 H47610 R86920

128152 297868_1 F07973 R20353 AA442660

128422 1811283_1 T77794 T85681

127897 446527_1 AA773681 AA773857

106566 120358_1 BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584
    AI369742 AI039558 AI885095 AI476470 AI287650 AI865299 AI985381 AW592624 AW340136 AI266556 AA456390
    AI310815 AA484951

129735 44573_2 AI950087 N70208 R97040 N36809 AI308119 AW967677 N35320 AI251473 H59397 AW971573 R97278 W01059
    AW967671 AA908598 AA251875 AI820501 AI820532 W87891 T85904 U71456 T82391 BE328571 T75102 R34725
    AA884922 BE328517 AI219788 AA884444 N92578 F13493 AA927794 AI560251 AW874068 AL134043 AW235363
    AA663345 AW008282 AA488964 AA283144 AI890387 AI950344 AI741348 AI689062 AA282915 AW102898 AI872193
    AI763273 AW173586 AW150329 AI653832 AI762688 AA988777 AA488892 AI356394 AW103813 AI539642 AA642789
    AA856975 AW505512 AI961530 AW629970 BE612881 AW276997 AW513601 AW512843 AA044209 AW856538
    AA180009 AA337499 AW981101 AA251669 AA251874 AI819225 AW205862 AI683338 AI858509 AW276905 AI633006
    AA972584 AA908741 AW072629 AW513995 AA293273 AA969759 N75828 N22388 H84729 H60052 T92487 AI022058
    AA780419 AA551005 W60701 AW613456 AI373032 AI564269 F00531 H83488 W37181 W78802 R66056 AI002839
    R67840 AA300207 AW959581 T63226 F04005

123147 219802_-2 AA487961
130529 158447_1 AA178953 AA192740
123579 genbank_AA608983 AA608983
109175 genbank_AA180496 AA180498
100789 tigr_HT4163 S67998
100858 tigr_HT4515 U10072
123798 579959_1 AA620411 AA287491
102116 entrez_U13706 U13706
102398 entrez_U42359 U42359
102764 entrez_U82310 U82310
118475 genbank_N66845 N66845
104776 genbank_AA026349 AA026349
104787 genbank_AA027317 AA027317
113702 genbank_T97307 T97307
113938 genbank_W81598 W81598
122635 genbank_AA454085 AA454085
108407 genbank_AA075519 AA075519
108432 genbank_AA076626 AA076626
108555 genbank_AA084963 AA084963
101349 entrez_L77559 L77559
124447 genbank_N48000 N48000
119071 genbank_R31180 R31180
103520 entrez_Y10511 Y10511
103663 genbank_Z78291 Z78291
128046 877605_1 AA873285 AI625762
126959 546044_1 AA199853 AA206355
123465 genbank_AA599033 AA599033

MISSING AT THE TIME OF PUBLICATION

TABLE 2 shows a preferred subset of the Accession numbers for genes found in Table 1
which are differentially expressed in prostate tumor tissue compared to normal prostate
tissue.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue (Relaxed ratio (87/70))
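
To make the R1 column concrete, the sketch below ranks a few rows by their tumor-to-normal ratio and keeps those at or above a cutoff. The ExpressionRow type, the rank_by_ratio helper and the cutoff of 10.0 are hypothetical choices made for illustration; the three sample rows are transcribed from Table 2, and nothing here is claimed to be the procedure used to build the table.

```python
from typing import List, NamedTuple, Optional


class ExpressionRow(NamedTuple):
    pkey: str                   # unique Eos probeset identifier
    ex_accn: str                # exemplar Genbank accession
    unigene_id: Optional[str]   # None when the probeset has no UnigeneID
    title: str                  # Unigene gene title (may be truncated)
    r1: float                   # ratio of tumor to normal body tissue


def rank_by_ratio(rows: List[ExpressionRow], min_ratio: float) -> List[ExpressionRow]:
    """Keep rows with R1 >= min_ratio, highest ratio first."""
    kept = [r for r in rows if r.r1 >= min_ratio]
    return sorted(kept, key=lambda r: r.r1, reverse=True)


rows = [
    ExpressionRow("102398", "U42359", None, "Human N33 protein form 1 (N33) gene, exo", 7.0),
    ExpressionRow("101889", "S39329", "Hs.181350", "kallikrein 2; prostatic", 15.4),
    ExpressionRow("101050", "K01911", "Hs.1832", "neuropeptide Y", 17.3),
]

# The cutoff is an arbitrary illustrative value, not one stated in the source.
top = rank_by_ratio(rows, min_ratio=10.0)
for r in top:
    print(r.pkey, r.ex_accn, r.r1)
```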



Pkey ExAccn UnigeneID Unigene Title R1

131919 AA121266 HS572458 ESTs 27.2
120328 AA196979 Hs.290905 ESTs; Weakly similar to (defline not ava 32.6
101486 M24902 Hs.1852 acid phosphatase; prostate 25.2
119073 R32894 Hs.279477 ESTs 24.3
133428 M34376 Hs.183752 microseminoprotein; beta- 23.8
128180 AA595348 Hs.171995 kallikrein 3; (prostate specific antigen 21.4
104080 AA402971 Hs57771 Homo sapiens mRNA for serine protease (T 18.9
127537 AA569531 Hs.162859 ESTs 16.6
131665 R22139 Hs.30343 ESTs 17.4
101050 K01911 Hs.1832 neuropeptide Y 17.3
130771 N48058 Hs.1915 folate hydrolase (prostate-specific memb 17
107485 W63793 Hs.262476 S-adenosylmethionine decarboxylase 1 16.7
106155 AA425309 Hs.33287 ESTs 16.5
129534 R73640 Hs.11260 ESTs 16.4
100569 HG2261-HT2351 Antigen, Prostate Specific Alt Splice 16
101889 S39329 Hs.181350 kallikrein 2; prostatic 15.4
135389 U05237 Hs.99872 fetal Alzheimer antigen 15
133944 AA045870 Hs.7780 ESTs 12.5
130974 X57985 Hs.2178 H2B histone family; member Q 11.3
114768 AA149007 Hs.182339 ESTs 11.3
104660 AA007160 Hs.14846 ESTs 11.4
131061 N64328 Hs.268744 ESTs; Moderately similar to KIAA0273 [H. 10.9
126645 AI167942 Hs.61635 Homo sapiens BAC clone RG041D11 from 7q2 10.7
135153 N40141 Hs.95420 Homo sapiens mRNA for JM27 protein; comp 10.6
107033 AA599629 Hs.113314 ESTs 10.6
118417 N66048 ESTs; Weakly similar to polymerase [H.sa 10.5
126758 W37145 Hs.293960 ESTs 10.2
107102 AA609723 Hs.30652 ESTs 10.1
116787 H28581 Hs.15641 ESTs 10.1
115719 AA416997 Hs59622 ESTs 10
123209 AA489711 Hs.203270 ESTs 9.9
101664 M60752 Hs.121017 H2A histone family; member A 9.8
112971 T17185 Hs.83883 ESTs 9.7
117984 N51919 Hs.106778 ESTs 9.7
129523 M30894 Hs.274509 T-cell receptor; gamma cluster 9.4
132964 AA031360 Hs.167133 ESTs 9.2
121853 AA425887 Hs.98502 ESTs 9
119617 W47380 HS55999 ESTs 8.9
105627 AA281245 Hs.23317 ESTs 8.8
101461 M22430 Hs.76422 phospholipase A2; group IIA (platelets; 8.7
124526 N62096 Hs.293185 yz61c5.s1 Soares_multiple_sclerosis_2NbH 8.5
133845 T68510 Hs.76704 ESTs 8.2
133354 AA055552 Hs.334762 ESTs; Weakly similar to KIAA0319 [H.sapi 8.1
119018 N95796 Hs.278695 ESTs 8
100394 D84276 Hs.66052 CD38 antigen (p45) 8
106579 AA456135 Hs.23023 ESTs 7.6
114965 AA250737 Hs.72472 ESTs 7.4
112033 R43162 Hs.22627 ESTs 7.1
102398 U42359 Human N33 protein form 1 (N33) gene, exo 7
101201 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin; 6.9
101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 63
120562 AA280036 Hs.302267 ESTs; Weakly similar to W01A6.c [C.elega 65
109112 AA169379 Hs.257924 ESTs 6.8
109795 F10707 Hs326416 ESTs 8.7
130336 X07730 Hs.171995 kallikrein 3; (prostate specific antigen 6.6
131425 AA219134 Hs.26691 ESTs 6.6

5 132902 AA490969 Hs.59838 ESTs 6.6 

133724 U07919 Hs.75746 aldehyde dehydrogenase 6 6.5 

120215 Z41050 Hs.108787 Homo sapiens Mcd4p homolog mRNA; complet 6.5

131881 AA010163 Hs.3383 upstream regulatory element binding prot 6.5

100727 X07290 Hs334766 Human HF.12 gene mRNA 6.3

121770 AA421714 Hs.278428 Homo sapiens mRNA for KIAA0896 protein; 6.3

123475 AA599267 Hs.250528 ESTs; Weakly similar to ANKYRIN; BRAIN V 6.3

133061 AB000584 Hs.286638 prostate differentiation factor 6.3

116429 AA609710 Hs.279923 ESTs; Weakly similar to similar to GTP-b 6.2

101233 L29008 Hs.878 sorbitol dehydrogenase 6.2

104891 AA011176 Hs.37744 ESTs 6.2

127248 AA325029 EST27953 Cerebellum II Homo sapiens cDNA 6.2

105500 AA256485 Hs.222399 ESTs 6.1 

130828 AA053400 Hs.203213 ESTs 5.9 

115357 AA281793 Hs.72988 ESTs 5.8 

20 116334 AA491457 Hs.48948 ESTs 5.7 

120132 Z38839 Hs.125019 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.6

106375 AA443993 Hs.289072 ESTs 5.6

124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 5.6

101791 M83822 Hs.62354 Human beige-like protein (BGL) mRNA; par 5.5

117698 N41002 Hs.45107 ESTs 5.5

122041 AA431407 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 5.6 

133723 AA088851 Hs.262476 S-adenosylmethionine decarboxylase 1 5.5 

113938 W81598 ESTs 5.4 

133015 AA047036 Hs.246315 ESTs 5.4 

30 108186 AA056482 Hs.7780 ESTs 5.3 

104466 N25110 Hs.326392 Human guanine nucleotide exchange factor 5.3 

104033 AA365031 Hs.98944 ESTs 5.3 

110844 N31952 Hs.167531 ESTs; Weakly similar to (defline not ava 5.3

129056 H70627 Hs.108336 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3

133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 5.3

129184 W26769 Hs.109201 ESTs; Highly similar to (defline not ava 5.2

101448 M21389 Hs.195850 keratin 5 (epidermolysis bullosa simplex 5.1

116188 AA464728 Hs.184598 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1

105921 AA402613 Hs.169119 ESTs 5.1

103375 X91868 Hs.54416 sine oculis homeobox (Drosophila) homolo 5.1

128871 AA400271 Hs.106778 ESTs; Highly similar to (defline not ava 5.1 

116238 AA479362 Hs^7144 ESTs 5 

102913 X07696 Hs.80342 keratin 15 5 

103011 X52541 Hs.326035 early growth response 1 5 

118981 N93839 Hs.39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

TABLE 2A shows the accession numbers for those primekeys lacking unigeneID's for Table
2. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
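
The relationship between the main table and this supplementary table can be sketched as a simple lookup: when a probeset has no UnigeneID, its gene cluster is found by Pkey. The CLUSTERS mapping and the cluster_for helper below are hypothetical names, and the two sample entries are transcribed from Table 2A on the assumption that the CAT number is written as a single token such as entrez_U42359.

```python
from typing import Dict, List, Optional, Tuple

# Pkey -> (CAT number, accessions); a two-row sample, not the full table.
CLUSTERS: Dict[str, Tuple[str, List[str]]] = {
    "102398": ("entrez_U42359", ["U42359"]),
    "113938": ("genbank_W81598", ["W81598"]),
}


def cluster_for(pkey: str, unigene_id: Optional[str]) -> Optional[Tuple[str, List[str]]]:
    """Return the gene cluster for a probeset lacking a UnigeneID, else None."""
    if unigene_id:
        return None  # the UnigeneID already identifies the gene
    return CLUSTERS.get(pkey)


print(cluster_for("102398", None))
print(cluster_for("101050", "Hs.1832"))
```

A probeset that has neither a UnigeneID nor an entry in the mapping also yields None, which a caller would need to handle separately.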



Pkey CAT number Accession 

118417 37186_1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214966 AA204735 AA207155 AA206262 
AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 
AI352545 BE501030 AI652535 BE465762 AA206331 AW451866 AM71088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703398 H92278 AW139734 H92683 U87589 U87&5 H69001 U87594 BE466420 AI824817 
BE466611 AI206344 AA574397 AA348354 AI493192 

127248 227560_1 AA364195 AA325029 AW962050 

107033 235652_1 AI141999 AA730176 R44544 R41778 AW300793 AW966157 AA918501 AA599629 AI082185 AI198537 AW006520 
AW236663 AW151420 AI826987 AI810832 AI669102 AI201981 N27331 AA335566 T84622 BE085347 BE085269 

102398 entrez_U42359 U42359 

113938 genbank_W81598 W81598 
TABLE 3 shows genes, including expressed sequence tags, differentially expressed in 
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02 
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor 
samples and various normal tissue samples showing the highest expression of the gene. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Ratio of tumor to normal body tissue 
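
For readers who want to work with the table programmatically, the column key above can be sketched as a small parser. This is only an illustrative sketch, not part of the disclosed method: it assumes each row is a single whitespace-separated line with a six-digit Pkey first, the exemplar accession second, an optional UniGene ID of the form "Hs.<digits>", a free-text title, and a numeric R1 last. The names Row, ROW_RE and parse_row are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One Table 3 row, following the column key (names are illustrative)."""
    pkey: int                  # Pkey: Eos probeset identifier
    ex_accn: str               # ExAccn: exemplar Genbank accession
    unigene_id: Optional[str]  # UnigeneID, absent for some probesets
    title: str                 # Unigene Title (truncated in the table)
    r1: float                  # R1: ratio of tumor to normal body tissue


# Assumed layout: Pkey, accession, optional "Hs.<n>", title, trailing number.
ROW_RE = re.compile(
    r"^(?P<pkey>\d{6})\s+"
    r"(?P<accn>\S+)\s+"
    r"(?:(?P<ug>Hs\.\d+)\s+)?"
    r"(?P<title>.*?)\s+"
    r"(?P<r1>\d+(?:\.\d+)?)$"
)


def parse_row(line: str) -> Optional[Row]:
    """Return a Row for a well-formed line, or None if it does not match."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return Row(int(m["pkey"]), m["accn"], m["ug"], m["title"], float(m["r1"]))


if __name__ == "__main__":
    print(parse_row("105691 AA287097 Hs.289068 transcription factor 4 6.3"))
```

Because the title itself may end in a digit (for example "transcription factor 4"), the pattern takes the last numeric token on the line as R1; rows whose identifiers are damaged will simply fail to match and return None.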



Pkey ExAccn UnigeneID Unigene Title 



100131 D12485 Hs.11951 

100235 D29954 Hs.13421 
100570 HG2261-HT2352 
100819 HG4020-HT4290 

101083 L00354 Hs.80247 

101247 L33801 Hs.78802 

101416 M17254 Hs.279477 
101447 M21305 

101485 M24736 Hs.89546 

101514 M28214 Hs.123072 

101628 M57399 Hs.44 

101663 M60750 Hs.2178 

101758 M77838 Hs.79217 

101768 M81118 Hs.78989 

101817 M88163 Hs.152282 

101888 M99701 Hs.95243 

102031 U04898 Hs.2156 

102052 U07559 Hs.505 

102221 U24576 Hs.3844 

102233 U26173 Hs.79334 

102302 U33052 Hs.69171 

102348 U37519 Hs.87539 

102457 U48807 Hs£359 

102473 U49957 Hs.180398 

102669 U71207 HSJ29279 

102698 U75272 Hs.1867 

102751 U80034 Hs.68583 

102823 U90914 Hs.5057 

102869 X02544 Hs.572 

103031 X54667 Hs.123114 

103043 X55733 Hs.93379 

103093 X60708 Hs.44926 

103376 X92098 Hs.323378 

103401 X95240 Hs.54431 

103613 Z46629 Hs.2316 
103677 Z83806 

103962 AA298180 Hs.83243 

104084 AA410529 Hs.30732 

104257 AF006265 Hs.9222 

104301 D45332 Hs.6783 

104769 AA025887 Hs.293943 

104851 AA040882 Hs.10290 

104896 AA054228 Hs.23165 

104956 AA074880 Hs.20509 

104957 AA074919 Hs.10026 
104967 AA084506 Hs.291000 
105099 AA150776 Hs.23729 
105298 AA233459 Hs.28389 



phosphodiesterase I/nucleotide pyrophosp 

KIAA0056 protein 

Hs.171995 

Hs.2387 

cholecystokinin 

glycogen synthase kinase 3 beta 
v-ets avian erythroblastosis virus E26 o 
Human alpha satellite and satellite 3 ju 
selectin E (endothelial adhesion molecul 
RAB3B; member RAS oncogene family 
pleiotrophin (heparin binding growth fac 
H2B histone family; member A 
pyrroline-5-carboxylate reductase 1 

SWI/SNF related; matrix associated; acti 

transcription elongation factor A (SII)- 

RAR-related orphan receptor A 

ISL1 transcription factor; LIM/homeodoma 

LIM domain only 4 

nuclear factor; interleukin 3 regulated 

protein kinase C-like 2 

aldehyde dehydrogenase 8 

dual specificity phosphatase 4 

LIM domain-containing preferred transloc 

eyes absent (Drosophila) homolog 2 

progastricsin (pepsinogen C) 

mitochondrial intermediate peptidase 

carboxypeptidase D 

orosomucoid 1 

cystatin S 

eukaryotic translation initiation factor 
dipeptidylpeptidase IV (CD26; adenosine 
coated vesicle membrane protein 
specific granule protein (28 kDa); cyste 
SRY (sex-determining region Y)-box 9 (ca 
H.sapiens mRNA for axonemal dynein heavy 
ESTs 
ESTs 

estrogen receptor-binding fragment-assoc 
ESTs 

ESTs; Weakly similar to !!!! ALU SUBFAMI 
U5 snRNP-specific 40 kDa protein (hPrp8- 
ESTs 

ESTs; Weakly similar to hypothetical pro 
ESTs; Weakly similar to ORF YJL063c [S.c 
ESTs 

Homo sapiens clone 24405 mRNA sequence 
ESTs 

121 



R1 

6.3 
5.1 

Antigen, Prostate Specific, Alt Splice 

Transglutaminase 10.5 

8S 

4.7 

4.7 

11 

9.8 

6.2 

8.4 

4.9 

5.4 

7.5 

55 

5.7 

132 

8.9 

5.6 

1A 

8.2 

5.9 

5.1 

5.7 

9 

10.6 

15.6 

4.9 

22.6 

47 

4.9 

5.8 

5.2 

7.4 

5.2 

4.9 

6 

64 

6.8 

105 

6.3 

4.9 

5.8 

64 

43 

65 

7 

5.1 
105304 AA233553 Hs.190325 ESTs 4.7 

105370 AA236476 Hs.22791 ESTs; Weakly similar to transmembrane pr 105 

105427 AA251330 Hs.28248 ESTs 5 

105542 AA261858 Hs.266957 ESTs; Weakly similar to heat shock prote 8.8 

105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zinc fi 5.5 

105640 AA281623 Hs.6685 ESTs; Weakly similar to KIAA0742 protein 8 

105645 AA282138 Hs.11325 ESTs 14 

105691 AA287097 Hs.289068 transcription factor 4 6.3 

105730 AA292701 Hs.5364 DKFZP564I052 protein 4.9 

105808 AA393808 Hs.286131 KIAA0438 gene product 7 

105826 AA398243 Hs.194477 ESTs; Moderately similar to similar to N 5 

105903 AA401433 Hs.200016 ESTs; Weakly similar to diphosphoinosito 9.9 

105906 AA401633 Hs.22380 ESTs 115 

106065 AA417558 Hs.25206 ESTs 5.1 

15 106094 AA419461 Hs.23317 ESTs 10.9 

106157 AA425367 Hs.34892 ESTs 6.6 

106184 AA426643 Hs.10762 ESTs 8.5 

106211 AA428240 Hs.126083 ESTs 8.4 

106213 AA428258 Hs.8769 Homo sapiens mRNA; cDNA DKFZp564E153 (fr 5.7 

20 106272 AA432074 Hs.323099 ESTs 55 

106369 AA443828 Hs.288856 ESTs 6.3 

106400 AA447621 Hs.94109 ESTs 5.4 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 9.2 

106507 AA452584 Hs.267819 protein phosphatase 1; regulatory (inhib 5.6 

25 106523 AA453441 Hs.31511 ESTs 47 

106532 AA453628 Hs.37443 ESTs 4.7 

106557 AA455087 Hs.22247 ESTs 5.7 

106575 AA456039 Hs.105421 ESTs 72 

106618 AA459249 Hs.8715 ESTs; Weakly similar to Similarity with 5.6 

30 106820 AA481037 Hs.12592 ESTs 5.4 

106846 AA485223 Hs.34892 ESTs 5.3 

106973 AA505141 Hs.11923 Human DNA sequence from clone 167A19 on 75 

107110 AA609952 Hs.12784 KIAA0293 protein 6.1 

107127 AA620504 Hs.179898 ESTs 7.1 

107159 AA621340 Hs.10600 ESTs; Weakly similar to ORF YKR081C [S.c 5.2 

107217 D51095 Hs.35861 DKFZP586E1621 protein 15.1 

107365 U78294 Hs.111256 arachidonate 15-lipoxygenase; second typ 4.7 

107630 AA007218 Hs.60178 ESTs 55 

107734 AA016225 Hs.7517 ESTs 45 

107760 AA018042 Hs.252085 EST 75 

107997 AA037388 Hs.52223 Human DNA sequence from clone 141H5 on c 105 

108012 AA039616 Hs.173334 ESTs 65 

108520 AA084138 Hs.46786 ESTs 7.9 

108583 AA088276 Hs.58826 ESTs 5.6 

45 108613 AA100967 Hs.69165 ESTs 6 

108664 AA113349 Hs.69588 EST 65 

108677 AA115629 Hs.118531 ESTs 5.9 

108807 AA129968 Hs.49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 55 
108910 AA136590 ESTs 5 

50 108933 AA147224 Hs537232 ESTs 12.7 

108948 AA149579 Hs.118258 ESTs 6.8 

109014 AA156790 Hs.262036 ESTs 155 

109124 AA171529 Hs.183887 ESTs 6.1 

109142 AA176438 Hs.41295 ESTs 5.1 

109277 AA196332 Hs.56043 ESTs 55 

109342 AA213620 Homo sapiens mRNA; cDNA DKFZp586M1418 (f 6 

109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 105 

109565 F01930 Hs.23648 ESTs 7 

109648 F04600 Hs.7154 ESTs 9.9 

60 109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 6.4 

109859 H02308 Hs.20792 ESTs 5.3 

110181 H20276 Hs.31742 ESTs 16.8 

110854 N32919 Hs.27931 ESTs 10 

110924 N47938 Hs.12940 yy84a09.s1 Soares_multiple_sclerosis_2Nb 55 

111046 N55514 Hs.318584 ESTs 6.9 

111091 N59858 Hs.33032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 52 

111157 N66613 Hs.89364 ESTs 5 

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C 55 
111221 N68869 Hs.15119 ESTs 62 
111348 N90041 Hs.9585 ESTs 5.4 

111353 N90430 Hs.6616 ESTs 53 

111495 R07210 Hs.9683 ESTs 5.8 

111540 R08850 Hs.9786 ESTs 6 

5 111579 R10857 Hs.167115 KIAA0830 protein 123 

111581 R10684 Hs.5794 ESTs 7.1 

111734 R25375 Hs.128749 ESTs 62 

111861 R37460 Hs.25231 ESTs 9.4 

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 6.5 

10 111937 R40431 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 4.8 

111987 R42036 Hs.6763 KIAA0942 protein 6.4 

112184 R49173 Hs330242 ESTs 5.6 

112286 R53765 Hs.158135 KIAA0981 protein 9.3 

112380 R59740 Hs3740 ESTs 4.7 

15 112452 R63841 Hs.157461 ESTs 6 

112601 R79111 Hs.78225 annexin A1 5.4 

112753 R93696 Hs.169882 ESTs 5.8 

112902 T09262 Hs.129190 ESTs 5.1 

112984 T23457 Hs.289014 ESTs 4.9 

20 113021 T23855 Hs.129836 KIAA1028 protein 103 

113083 T40530 Hs266957 ESTs; Weakly similar to heat shock prote 5.7 

113200 T57773 Hs.10263 ESTs 73 

113494 T88878 Hs36538 ESTs 8.7 

113849 W60439 Hs.8858 ESTs; Moderately similar to cbp146 [M.mu 4.9 

113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7 

113950 W85765 Hs3Q504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7 

113986 W87462 Hs21894 ESTs 5.9 

113989 W87544 Hs268B28 ESTs 4.7 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213 

30 114340 Z41395 Hs.143611 ESTs 9.6 

114346 Z41450 Hs.130489 ESTs 5.2 

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4 

114463 AA025370 Hs.40109 KIAA0872 protein 8.2 

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5.4 

114721 AA131450 Hs.103822 ESTs 4.8 

114730 AA133527 Hs.331328 ESTs; Weakly similar to The KIAA0138 gen 5.1 

114833 AA234362 Hs.87159 ESTs; Moderately similar to CGI-66 prote 5.5 

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 6.3 

114884 AA235811 Hs.293672 ESTs 5.2 

40 114895 AA236177 Hs.76591 KIAA0887 protein 4.7 

114908 AA236545 Hs.54973 ESTs 52 

114932 AA242751 Hs.16218 KIAA0903 protein 5.7 

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 52 

115140 AA258030 Hs.279938 ESTs; Weakly similar to supported by GEN 5.9 

45 115468 AA287061 Hs.48499 ESTs; Highly similar to Bdeight protein 4.7 

115583 AA398913 Hs.45231 LDOC1 protein 7.6 

115709 AA412519 Hs38279 ESTs 43 

115772 AA423972 Hs.131740 ESTs 5 

115774 AA424029 Hs.288390 ESTs; Moderately similar to dynamin; int 5 A 

115776 AA424038 Hs.81897 ESTs 5 

115821 AA427528 Hs.130985 ESTs; Weakly similar to ZINC FINGER PROT 13.7 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q3 103 

116024 AA451748 Hs.33883 Human DNA sequence from clone 718J7 on c 63 

116108 AA457566 Hs.28777 ESTs 6 

55 116117 AA459117 Hs31575 SEC63; endoplasmic reticulum translocon 7.3 

116146 AA460701 Hs.15423 ESTs 53 

116296 AA489033 Hs.62601 Homo sapiens mRNA; cDNA DKFZp588K131 8 (f 5.7 

116379 AA521472 Hs.71252 ESTs 5.9 

116393 AA599463 Hs.306051 protein phosphatase 2 (formerly 2A); reg 5.9 

60 116401 AA599963 Hs39698 ESTs 7.9 

116416 AA609219 Hs.39982 ESTs 9.2 

116587 D59325 Hs.121429 ESTs 52 

116601 D80055 Hs.45140 ESTs 4.9 

116684 F09156 Hs.66095 ESTs 72 

116722 F13654 HSFIH32 Stratagene cat#937212 (1992) Hom 5.5 

116766 H13260 Hs.95097 ESTs 5.9 

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 63 

117557 N33920 Hs.44532 diubiquitin 43 

117708 N45114 Hs.126280 ESTs 63 
118001 N52151 Hs.47447 ESTs 114 

118229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 6.2 

118599 N69207 Hs.203697 ESTs 5.8 

118845 N70358 Hs.125180 growth hormone receptor 7.1 

118873 N89881 Hs.44577 ESTs 6 

118985 N94303 Hs.55028 ESTs 9.3 

119107 R42424 Hs.63841 ESTs 6 

119126 R45175 Hs.117183 ESTs 17.9 

119271 T16387 Hs.65328 ESTs 8 

10 119367 T78324 Hs.250895 ESTs 5 

119721 W69440 Hs.48376 ESTs 15.4 

119741 W70205 Hs.43670 kinesin family member 3A 10.1 

119780 W72987 Hs.191381 ESTs; Weakly similar to hypothetical pro 5.3 

120217 Z41078 Hs.66035 ESTs 4.8 

120266 AA173939 Hs.205442 ESTs; Weakly similar to inner centromere 8.8 

120294 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-62 antige 4.9 

120418 AA236010 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 4.7 

120486 AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6 

120524 AA261852 Hs.192905 ESTs 4.9 

20 120571 AA280738 Hs.34892 ESTs 8.8 

120596 AA282074 Hs237323 ESTs 65 

120713 AA292655 Hs.96557 ESTs 9.9 

120992 AA398246 Hs.97594 ESTs 164 

121429 AA406293 Hs.41167 ESTs 6.9 

121503 AA412049 Hs.290347 ESTs 7.6 

121512 AA412105 Hs.193736 ESTs 5.8 

121816 AA424814 H$A8W ESTs 4.6 

122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapie 5.6 

122294 AA437311 Hs.98927 ESTs 5.7 

122411 AA446859 Hs.99083 ESTs 65 

122791 AA460158 Hs.129836 KIAA1028 protein 12.4 

122792 AA460225 Hs.99519 ESTs 5.1 
122969 AA478539 Hs.104336 ESTs 45 
123095 AA485724 Hs.27413 ESTs 54 

35 123100 AA485957 Hs.306219 Homo sapiens clone 25032 mRNA sequence 5 

123295 AA495981 Hs.250830 ESTs 4.7 

123311 AA496252 Hs.105069 ESTs 74 

123583 AA609006 Hs.111240 ESTs 9.1 

123619 AA609200 ESTs 4.7 

123645 AA609310 Hs.188691 ESTs 4.8 

123709 AA609651 Hs.112742 ESTs 7 

123968 C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5 

124178 H45996 Hs.97101 putative G protein-coupled receptor 6.8 

124352 N21626 Hs.102406 ESTs 102 

45 124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 10.6 

124515 N58172 Hs.109370 ESTs 142 

124911 R88992 Hs.174195 ESTs 4.8 

125154 W38419 ESTs 4.7 

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NF 5.1 

50 126802 AA947601 Hs.97056 ESTs 5.1 

126812 Z36290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 4.6 

127080 AA682913 Hs.190173 ESTs 5 

127308 AA507628 Hs.334390 ESTs 4.8 

127370 AI024352 Hs.70337 immunoglobulin superfamily; member 4 4.7 

127388 AI457411 Hs.106728 ESTs 4.8 

127965 AAS28760 H$292059 ESTs 4.8 

128172 AI400862 Hs.265130 ESTs 5 

128305 AI039722 Hs.279009 ESTs 5.8 

128420 AI088155 Hs.41296 ESTs; Weakly similar to unknown [H.sapie 17 

60 128467 AA176446 Hs.180428 ESTs; Weakly similar to hypothetical 43. 4.8 

128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 7.9 

128625 AA242616 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1 

128651 AA446990 Hs.103135 ESTs 6.5 

129088 AA215971 Hs.194431 KIAA0992 protein 52 

129136 N26391 Hs.250723 ESTs 5.1 

129171 AA234048 Hs.7753 calumenin 5.8 

129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 5.8 

129386 N27524 Hs.260024 Cdc42 effector protein 3 52 

129467 AA410311 Hs.44208 ESTs 5.1 
129564 H22136 HsJ5295 guanylate cyclase 1; soluble; alpha 3 16.3 

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 9.2 

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 8.6 

129823 X00948 Hs.105314 relaxin 2 (H2) 9.1 

5 129847 W46767 Hs296178 ESTs; Weakly similar to RNA POLYMERASE I 5.4 

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 6.5 

129958 L20591 Hs.1378 annexin A3 5.1 

129977 J04076 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6 

130061 U82256 Hs.172851 arginase; type II 7 A 

10 130241 U78313 Hs.153203 MyoD family inhibitor 4.9 

130466 N21679 Hs.180059 ESTs 5.8 

130541 X05608 Hs.211584 neurofilament; light polypeptide (68kD) 6.7 

130619 AA477739 Hs.12532 ESTs 6.4 

130925 N71935 Hs.169378 multiple PDZ domain protein 7.9 

130938 AA013250 Hs.21398 ESTs; Moderately similar to PUTATIVE GLU 6.2 

130971 H20332 Hs.301444 signal sequence receptor; gamma (translo 6.4 

131066 F09006 Hs.22588 ESTs 5 

131126 F09012 Hs.181326 myotubularin related protein 2 6.4 

131310 J02960 Hs.2551 adrenergic; beta-2-; receptor, surface 7.9 

131487 AA253220 Hs.27373 Homo sapiens mRNA; cDNA DKFZp564O1763 (f 5.9 

131561 X59841 Hs.294101 pre-B-cell leukemia transcription factor 7.6 

131562 U90551 Hs£6777 H2A histone family; member L 5.1 
131579 N62922 Hs£9088 ESTs 11 
131629 AA442119 H&238809 ESTs 4.9 

25 131682 AA428368 Hs.30654 ESTs 4.8 

131699 R68657 Hs.90421 ESTs; Moderately similar to !!!! ALU SUB 6.5 

131795 N32724 Hs.32317 Sox-like transcriptional factor 5.6 

132053 H93381 Hs.38085 ESTs; Weakly similar to putative glycine 72 

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator wit 5.6 

132191 AA449431 Hs^88361 KIAA0741 gene product 8 

132256 AA608856 Hs.431 murine leukemia viral (bmi-1) oncogene h 55 

132482 AA429478 Hs.238126 ESTs; Highly similar to CGI-49 protein [ 6.6 

132533 AA021608 Hs.172510 ESTs 5.8 

132572 AA448297 Hs£37825 signal recognition particle 72kD 6.2 

35 132581 R42266 Hs.52256 ESTs; Weakly similar to beta-TrCP protel 16 

132700 N47109 Hs5521 ESTs 6.8 

132701 AA279359 Hs.55220 BCL2-associated athanogene 2 5.3 
132725 L41887 Hs.184167 splicing factor; arginine/serine-rich 7 7.8 
132783 N74897 Hs.278894 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 5.9 

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8 

132939 U76189 Hs.61152 exostoses (multiple)-like 2 5.2 

133142 F03321 Hs.65874 ESTs 5.2 

133342 U29589 Hs.7138 cholinergic receptor; muscarinic 3 105 

133434 AA278852 Hs.30212 ESTs 5.8 

45 133453 M68941 Hs.73826 protein tyrosine phosphatase; non-recept 4.9 

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) 13.1 

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor t 4.6 

133608 D13315 Hs.75207 glyoxalase I 4.8 

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5 

133633 D21262 Hs.75337 nucleolar phosphoprotein p130 6.3 

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6 

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4 

134095 U47414 Hs.79069 cyclin G2 52 

134249 N89827 Hs.80667 RALBP1 associated Eps domain containing 6.5 

55 134321 AA418230 Hs.8172 ESTs 7 

134453 X70683 Hs.83484 SRY (sex determining region Y)-box 4 4.7 

134542 X57025 Hs.85112 insulin-like growth factor 1 (somatomedi 7.7 

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4 

134592 U82613 Hs.289104 Alu-binding protein with zinc finger dom 5.4 

134654 W23625 Hs.8739 ESTs; Weakly similar to ORF YGR200C [S.c 5 
134666 AA482319 Hs.8752 putative type II membrane protein 6.4 
134806 Z49099 Hs.89718 spermine synthase 6.7 
134951 AA431480 Hs.169358 ESTs 9.8 
135066 X04602 Hs.93913 interleukin 6 (interferon; beta 2) 5.7 

135155 AA358268 Hs.166556 ESTs; Moderately similar to transcriptio 4.9 
135411 L10333 Hs.99947 reticulon 1 5.3 
300023 M10098 AFFX control: 18S ribosomal RNA 4.6 

300254 AW079607 Hs£5610 ESTs; Weakly similar to ZnT-3 {H^aplens 7.8 
300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115 
300319 AW157646 Hs.153506 ESTs; Weakly similar to microtubule-acti 8.5 

300568 H86709 Hs*26392 son of sevenless (Drosophila) homolog 1 5.8 

300578 AI989417 Hs.134289 ESTs 4.4 

300671 AI239706 Hs.93810 ESTs 7.9 

300675 AA039352 Hs.125034 ESTs; Weakly similar to ORF YDL040c [S.c 4.5 

300680 AW468066 Hs.24817 ESTs; Weakly similar to KIAA0986 protein 52 

300762 AI497778 Hs.20509 ESTs 6.4 

300810 AI076890 Hs.146847 ESTs 5.8 

300813 AM06411 Hs.208341 ESTs; Weakly similar to KIAA0989 protein 10.6 

10 300823 AI863068 Hs.106623 ESTs; Weakly similar to putative zinc fi 5.6 

300834 AF109300 Hs.147924 ESTs 6.7 

300923 AW136372 Hs.1852 ESTs 7.6 

300962 AA593373 Hs.293744 ESTs 5.5 

301015 AA947682 Hs.20252 ESTs; Weakly similar to Chain A; Cdc42hs 7 

15 301042 AI659131 Hs.197733 ESTs 24.9 

301242 AW161535 Hs.23782 ESTs 11* 

301254 AI049624 Hs.283390 EST cluster (not in UniGene) with exon h 4.3 

301262 H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 4.3 

301388 AA156879 Hs£62Q36 ESTs; Weakly similar to ZINC FINGER PROT 6.6 

20 301563 AI802946 Hs.44208 ESTs; Weakly similar to match to ESTs AA 5.7 

301656 AW008475 Hs.151258 EST cluster (not In UniGene) with exon h 6.8 

301689 Z44810 Hs.301789 ESTs; Weakly similar to similar to C.ele 6.3 

301783 AL046347 Hs*3937 Homo sapiens PAC clone DJ1 159004 from 7p 62 

301805 AI800004 Hs.142846 ESTs; Weakly similar to MesP1 [M.musculu 8.5 

301846 R20002 Hs.6823 ESTs; Weakly similar to intrinsic factor 4.6 

301891 AF131855 Hs.279591 Homo sapiens clone 25056 mRNA sequence 6.3 

302005 AI869666 Hs.123119 ESTs 36.8 

302056 AI457532 Hs.30488 ESTs; Moderately similar to ROSA26AS [M. 9.5 

302067 H05698 Hs.222399 ESTs; Weakly similar to protein-tyrosine 5.8 

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 8.8 

302147 AB022660 Hs.151717 KIAA0437 protein 5.9 

302214 AJ001454 Hs.159425 Homo sapiens mRNA for testican-3 4.3 

302236 AI128606 Hs.6557 zinc finger protein 161 4.3 

302358 D81150 Hs.322848 EST cluster (not in UniGene) with exon h 5.5 

302410 NM_004917 Hs.218368 EST cluster (not in UniGene) with exon h 26.8 

302486 AC003682 Hs.183512 multiple UniGene matches 8.2 

302582 NM_000522 Hs.249195 EST cluster (not in UniGene) with exon h 6.4 

302785 AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5 

302792 AA343696 Hs.46821 ESTs; Weakly similar to putative [H.sapi 4.8 

40 302881 AA508353 Hs.105314 relaxin 1 (H1) 78* 

302892 N58545 Hs.42346 histone deacetylase 3 8* 

302970 AW1 18352 Hs.312679 EST cluster (not In UniGene) with exon h 7.4 

302977 AW263124 Hs.315111 EST cluster (not In UniGene) with exon h 5.5 

303029 AF199613 EST cluster (not In UniGene) with exon h 4.6 

45 303125 AF161352 Hs.111782 EST cluster (not In UniGene) with exon h 5.8 

303280 AI571580 Hs.170307 ESTs 4.3 

303306 AA215297 Hs.61441 EST cluster (not In UniGene) with exon h 6.4 

303309 AL134164 Hs.145416 ESTs 6.6 

303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjug 19* 

303380 AA298471 Hs.326567 EST cluster (not in UniGene) with exon h 6.6 

303401 AA758552 Hs*09497 ESTs 6.8 

303525 AW516519 Hs.273294 ESTs 4.8 

303526 AA348111 Hs.96900 ESTs 12.1 - 
303540 AA355607 Hs*09490 ESTs; Weakly similar to MMSET type I [H. 8.2 

55 303572 AW338520 Hs.242540 ESTs 8.4 

303685 AW500106 Hs.23643 EST cluster (not In UniGene) with exon h 4.9 

303699 D30891 Hs.19525 EST cluster (not In UniGene) with exon h 15.7 

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDa subunit o 6* 

303718 AI741397 Hs.114658 ESTs 4.6 

303722 AA521510 Hs.145010 ESTs 12* 

303732 AW502405 Hs.125759 ESTs; Weakly similar to tumor suppressor 4.3 

303735 AA707750 Hs.169055 ESTs; Weakly similar to cis-Golgi matrix 5.4 

303752 AI017286 Hs*957 EST cluster (not in UniGene) with exon h 5.3 

303753 AW503733 Hs.9414 ESTs 13 
303813 AI275850 Hs.114658 EST cluster (not in UniGene) with exon h 7.8 

304053 R00493 Hs.125565 translocase of inner mitochondrial membr 4* 

304218 N66373 Hs.27973 ESTs; Weakly similar to ZK354.7 [C.elega 6 

305200 AA668128 Hs.45207 EST singleton (not In UniGene) with exon 5.7 

306716 AI024916 Hs.251354 ESTs 5.7 






307848 AI364186 EST singleton (not In UniGene) with exon 7.3 

307871 AI368665 Hs.31476 EST singleton (not In UniGene) with exon 5.4 

308050 AI460004 Hs.31608 EST singleton (not in UniGene) with exon 8.1 

308362 AI613519 Hs.105749 EST singleton (not in UniGene) with exon 55 

308923 AI863051 Hs.278815 ESTs 4.4 

309116 AI927149 Hs.29787 ribosomal protein L10 4.5 

309375 AW075342 Hs.9271 EST singleton (not in UniGene) with exon 7.4 

309674 AW205604 Hs.266009 ESTs; Weakly similar to !!!! ALU SUBFAMI 5 

310095 AI921750 Hs.144871 ESTs 5 

310098 AI685841 Hs.161354 ESTs 11J6 

310250 AI478829 Hs.158465 ESTs 5.8 

310365 AI262148 Hs.145569 ESTs 9.7 

310382 AI734009 Hs.127699 EST cluster (not in UniGene) 104 

310409 AI612775 Hs.145710 ESTs 4.6 

310431 AI420227 Hs.149358 ESTs 725 

310573 AW292180 Hs.156142 ESTs 7.6 

310598 AI338013 Hs.140546 ESTs 92 

310539 AW269082 Hs.175162 ESTs 4.5 

310787 AW262580 Hs.147674 ESTs 4.9 

20 310816 AI973051 Hs.224965 ESTs 7.6 

311251 AI655662 Hs.197698 ESTs 413 

311280 AI767957 Hs.198248 ESTs; Weakly similar to Y38A8.1 gene pro 45 

311330 AI679524 Hs.201629 ESTs; Moderately similar to !!!! ALU SUB 4.6 

311515 AW136713 Hs.23862 ESTs 5.9 

25 311574 AI824863 Hs.211420 ESTs 4.8 

311587 AI828254 Hs.271019 ESTs 5.8 

311596 AI682088 Hs.79375 ESTs 264 

311631 AI809519 Hs.27133 ESTs 6.4 

311688 AW025661 Hs.240090 ESTs 7.4 

311783 AI682478 Hs.13528 EST 4.6 

311826 AA765470 Hs.85092 ESTs 6.7 

311853 AW014013 Hs.107056 ESTs 5.3 

311901 R16890 Hs.137135 ESTs 5.6 

311932 AW451654 Hs.257482 ESTs 4.3 

35 312153 AA759250 Hs.118625 cytochrome b-561 11 

312182 AA834800 Hs.326263 EST cluster (not in UniGene) 16.9 

312242 AI380207 Hs.125276 ESTs 4.7 

312298 C01367 Hs.127128 ESTs 5.3 

312407 R46180 Hs.153485 ESTs 62 

312424 AA847398 Hs.291997 ESTs 4.8 

312425 R49353 Hs.293892 ESTs 52 

312480 R88651 Hs.144997 ESTs 95 

312518 C17785 Hs.182738 ESTs 63 

312521 AA033609 Hs.239884 ESTs 112 

312527 AI695522 Hs.191271 ESTs 4.7 

312539 AI004377 Hs.200360 ESTs 7 

312546 AI623511 Hs.118567 ESTs 5.1 

312563 AA976064 Hs.180342 ESTs 65 

312623 AA694607 Hs.176956 EST cluster (not in UniGene) 103 

50 312857 M772279 Hs.126914 ESTs 5 

312890 AI813654 Hs.5957 ESTs 5.8 

312903 AA939266 Hs.278626 ESTs 7.7 

312905 H92571 Hs.234478 ESTs 65 

312976 AA836271 Hs.125830 ESTs 4.6 

312983 AI079278 Hs.269899 ESTs 5.1 

312996 AA249018 Hs.154331 EST cluster (not in UniGene) 7 

313035 N36417 Hs.144928 ESTs 6.3 

313166 AI801098 Hs.151500 ESTs 4.3 

313188 AI039702 Hs.179573 collagen; type I; alpha 2 4.8 

60 313218 AA827805 Hs.124296 ESTs 5 

313226 AI20G281 Hs.123910 ESTs 5.9 

313325 AI420611 Hs.127832 ESTs 4.6 

313326 AI088120 Hs.122329 ESTs 7.4 
313425 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc 6.3 

65 313499 AI261390 Hs.146085 ESTs 5.6 

313540 AI797301 Hs.5740 ESTs 5.9 

313568 AW467376 Hs.129640 ESTs 4.3 

313569 AI273419 Hs.135146 ESTs; Weakly similar to ZK1G58<5 [C.eleg 4.6 
313603 AW468119 Hs.287631 EST cluster (not UniGene) 8.8 
313615 AW295194 Hs.301997 DKFZP434N126 protein 62 

313625 AW466402 Hs.254020 ESTs 73 

313634 AA688292 Hs.337786 ESTs 44 

313635 AA507227 Hs.6390 ESTs 8.1 

313638 AI753075 Hs.104627 ESTs 6.7 

313670 C16690 Hs.23767 EST cluster (not in UniGene) 4.4 

313671 W49823 Hs.104613 ESTs 4.4 

313676 AA861697 Hs.120591 EST cluster (not in UniGene) 134 

313703 AI161293 Hs.280380 ESTs; Weakly similar to KIAA0525 protein 10 

313712 AA763553 Hs.74170 ESTs 52 

313800 AW296132 Hs.55098 ESTs 5.4 

313979 AI535895 Hs.221024 ESTs 4.3 

314121 AI732100 Hs.187619 ESTs 13.6 

314123 AW245993 Hs.223394 ESTs 6.4 

314171 AI821895 Hs.193481 ESTs 294 

314188 AL138431 Hs.164243 ESTs 4.6 

314219 AL036001 Hs.48376 ESTs 5.7 

314236 AA743396 Hs.169023 ESTs 4.9 

314237 M732359 Hs.96264 ESTs 4.4 

314284 AA731431 Hs.293464 EST cluster (not in UniGene) 64 

314305 AI280112 Hs.125232 ESTs 5.3 

314343 AI754701 Hs.328476 ESTs; Weakly similar to alternatively sp 62 

314530 AI052358 Hs.193726 ESTs 45 

314691 AW207206 Hs.136319 ESTs 17 

314695 AW502698 Hs.118152 ESTs 8.9 

314785 AI538226 Hs.32976 ESTs 9.4 

314801 AA481027 Hs.109045 ESTs; Weakly similar to ORF YGR245c [S.c 8 

314864 AA493811 Hs.294068 ESTs 6 

314807 AI672225 Hs.222888 ESTs 

314916 AA548906 Hs.122244 ESTs 4.5 

314954 AA521381 Hs.187726 ESTs 5.3 

314981 AA524953 Hs.293334 ESTs 4.6 

315021 AA533447 Hs.312989 EST cluster (not in UniGene) 5.1 

315051 AW292425 Hs.163484 EST 15.5 

315052 AA876910 Hs.134427 ESTs 20 

315073 AW452948 Hs.257631 ESTs 5.3 

315084 AI821085 ESTs 82 

315214 AI915927 Hs.34771 ESTs 54 

315220 AI420753 Hs.66731 ESTs 5.1 

315278 AI935544 Hs.12450 ESTs 5.8 

315282 AI222165 Hs.144923 ESTs 4.5 

315368 AW291563 Hs.104696 ESTs 6 

315369 AA764918 Hs.256531 ESTs 4.8 

315378 AI263393 Hs.145008 ESTs 62 

315379 AI378329 Hs.126629 ESTs 5.4 

315402 AW293424 Hs.75354 ESTs 5.1 

315442 AA977935 Hs.127274 ESTs 6.6 

315443 AW003416 Hs.160604 ESTs 5.5 

315528 R37257 Hs.184780 ESTs 8.1 

315593 AW198103 Hs.158154 ESTs 9.9 

315634 AA837085 Hs.220585 ESTs 7.8 

315705 AW449285 Hs.313636 ESTs 6.9 

315707 AJ418055 Hs.161160 ESTs 5.1 

315714 AA744015 Hs.298138 EST cluster (not in UniGene) 6.1 

315740 T05558 Hs.156880 EST cluster (not in UniGene) 6.8 

315762 AI391470 Hs.158618 ESTs 5.3 

315769 AA744875 Hs.189413 ESTs 5 

315843 AA679430 Hs.191897 ESTs 5.7 

315990 AI800041 Hs.190555 ESTs 9.2 

316012 AA764950 Hs.119898 ESTs 4.3 

316036 AA708016 Hs.190389 ESTs 5.9 

316055 AA693860 Hs.6947 EST cluster (not in UniGene) 6.7 

316074 AW517542 Hs.293273 ESTs 55 

316100 AW203986 Hs.213003 ESTs 5.1 

316169 AI127483 Hs.120451 ESTs 82 

316442 AA760894 Hs.153023 ESTs 17.1 

316491 AA766025 Hs.186854 EST 4.6 

316504 AW135854 Hs.132458 ESTs 4.3 

316667 AW015940 Hs.232234 ESTs 7.6 
316854 AAS31215 Hs.159066 ESTs; Weakly similar to predicted using 5.1 

316905 AW138241 Hs.210848 ESTs 6.4 

317008 AW051597 Hs.143707 ESTs 4.4 

317019 AA864968 Hs.127699 ESTs 11 

317194 AW445167 Hs.126036 ESTs 13.5 

317224 D56760 Hs.93029 ESTs 8.7 

317404 AI806867 Hs.126594 ESTs 8.7 

317501 AA931245 Hs.137097 ESTs 11.1 

317548 AI654187 Hs.195704 ESTs 14.2 

10 317651 AW292779 Hs.169799 ESTs 5.8 

317758 AI733277 Hs.128321 ESTs 5.4 

317850 N29974 Hs.152982 EST cluster (not in UniGene) 114 

317869 AW285184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.8 

317902 AI828602 Hs.211265 ESTs 5.3 

15 317916 AI565071 Hs.159983 ESTs 7.7 

318239 AI085198 Hs.164226 ESTs 13.1 

318268 AI817736 Hs.182490 ESTs 62 

318327 AW294013 Hs.200942 ESTs 4.6 

318363 R45530 Hs.1440 gamma-aminobutyric acid (GABA) A recepto 6 

318428 AI949409 Hs.194591 ESTs 123 

318464 AI151010 Hs.157774 ESTs 4.3 

318524 AW291511 Hs.159066 ESTs 25.9 

318540 T30280 Hs.274803 EST cluster (not in UniGene) 7 

318591 AW206806 Hs.115325 ESTs 4.8 

25 318615 AI133817 Hs.10177 ESTs 5.5 

318646 AW175665 Hs.278695 ESTs 5.7 

318667 AI493742 Hs.165210 ESTs 11 

318668 W26276 Hs.136075 ESTs 5.9 
318753 AA578265 Hs.7130 copine IV 5.5 

319080 Z45131 Hs.23023 ESTs 16.9 

319181 F06504 Hs.27384 EST cluster (not in UniGene) 4.6 

319191 AF071538 Hs.79414 prostate epithelium-specific Ets transcr 6.6 

319233 R21054 Hs.180532 ESTs 4.9 

319586 D78808 Hs.283683 ESTs 8.2 

319750 AA621606 Hs.117956 ESTs 95 

319763 AA460775 Hs.6295 ESTs 143 

319824 AA424266 Hs.123642 EST cluster (not in UniGene) 12.8 

319838 AA337642 Hs.95262 nuclear factor related to kappa B bindin 5.1 

319913 AA179304 Hs.271586 ESTs; Moderately similar to !!!! ALU SUB 4.3 

319964 T80579 Hs.290270 ESTs 5.8 

320076 AI653733 Hs.271593 ESTs 8.5 

320102 AW296219 Hs.115325 RAB7; member RAS oncogene family-like 1 9.8 

320187 T99949 Hs.303428 EST cluster (not in UniGene) 9.8 

320211 AL039402 Hs.125783 DEME-6 protein 7.9 

320324 AF071202 Hs.139336 ATP-binding cassette; subfamily C (CFTR 562 

320455 R49889 H 3^41 44 EST cluster (not in UniGene) 8.3 

320464 AI089817 Hs.237146 ESTs 54 

320561 NM_006953 Hs.159330 EST cluster (not in UniGene) 7 

320574 AL049443 Hs.161263 Homo sapiens mRNA; cDNA DKFZp586N2020 (f 44 

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7 

320654 AW263086 Hs.118112 ESTs 6 

320796 AF038966 Hs.31218 secretory carrier membrane protein 1 135 

320800 AI681006 Hs.71721 ESTs 62 

320813 AW360847 Hs.16578 ESTs 9.3 

55 320853 AI473796 Hs.135904 ESTs 8.1 

320856 D59945 Hs.65366 EST cluster (not in UniGene) 6 

320899 AA633772 Hs.116798 ESTs 92 

320918 AW195012 Hs.293970 ESTs 5 

320973 H19732 Hs.247917 ESTs 5.9 

60 321099 AA018386 Hs.64341 ESTs 4.6 

321190 H52462 Hs.163872 EST cluster (not In UniGene) 5.8 

321318 AB033041 Hs.137507 EST cluster (not In UniGene) 8.4 

321382 AW372449 Hs.175982 EST cluster (not In UniGene) 7.3 

321441 AW297633 Hs.118498 ESTs 14.7 

321538 H80433 Hs.46903 EST cluster (not in UniGene) 92 

321609 H86021 Hs.162538 ESTs; Weakly similar to hMmTRAlb [H.sapi 4.8 

321636 AI781638 Hs.193465 ESTs 5J5 

321638 AI356352 Hs.108932 ESTs 4.8 

321644 AI204177 H8237396 ESTs 6.6 
321681 AA233821 Hs.190173 EST cluster (not in UniGene) 4.6 

321726 X91221 Hs.144465 EST cluster (not in UniGene) 5 

321758 U29112 Hs.196151 EST cluster (not in UniGene) 6.2 

321877 AU09784 Hs.189222 EST cluster (not in UniGene) 4.6 

321899 N55158 Hs.29468 ESTs 4.6 

321902 AA746374 Hs.145010 ESTs 82 

322007 AW410646 Hs.164649 ESTs 5.1 

322055 AL137646 Hs.146001 EST cluster (not in UniGene) 4.3 

322092 AF085833 Hs.135624 EST cluster (not in UniGene) 4.3 

322221 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 4.4 

322278 AF086283 EST cluster (not in UniGene) 5.8 

322303 W07459 Hs.157601 EST cluster (not in UniGene) 22 

322437 AW393804 Hs.170253 ESTs; Weakly similar to rabaptin-4 [H.sa 4.4 

322493 AF143235 Hs.279819 EST cluster (not in UniGene) 72 

322782 AA056060 Hs.202577 EST cluster (not in UniGene) 18.4 

322811 AA782292 Hs.105872 ESTs 6.9 

322818 AW043782 Hs.293616 ESTs 10.7 

322826 AI8078S3 Hs.180059 ESTs 5 

322887 AI986306 Hs.86149 ESTs; Weakly similar to KIAA0969 protein 11.9 

322889 AA081924 Hs.124918 ESTs 7.1 

322924 AA669253 Hs.136075 ESTs 4.5 

322982 AI351191 Hs.128430 ESTs 6.6 

322994 AA422116 Hs.191461 ESTs 4.7 

323040 AA336609 Hs.10862 ESTs 6.9 

323041 AL118747 Hs.26691 EST cluster (not in UniGene) 8.3 

323045 AA146950 Hs.188836 ESTs 4.6 

323048 AL118923 Hs.175110 EST cluster (not in UniGene) 7.5 

323070 AA157726 Hs.264330 ESTs 7JS 

323071 AA157867 Hs.5722 ESTs 4.7 

323097 Z44354 Hs.296261 guanine nucleotide binding protein (G pr 4.9 

323131 AA176982 Hs.270124 EST cluster (not in UniGene) 8.1 

323136 AL120351 Hs.30177 EST cluster (not in UniGene) 4.3 

323175 AI827137 Hs.336454 ESTs 62 


323216 


AF131846 


Hs. 1 3396 Homo sapiens done 25028 mRNA sequence 


6.3 


323226 


AF055019 


Hs21906 Homo sapiens done 24670 mRNA sequence 


12.6 


323236 


AA363148 


Hs293960 ESTs 


10.9 


323262 


AI829770 


Ks.190642 ESTs 


7.6 


323276 


AA836452 


Hs.323822 ESTs 


7.6 


323287 


AA639902 


Hs.104215 ESTs 


24.7 


323335 


AI655499 


Hs.161712 ESTs 


14.1 


323341 


AL134875 


HS.108646 ESTs 


5.3 


323362 


AL135067 


Hs.1 17182 ESTs 


6.1 


323486 


C05278 


Hs299221 ESTs; Moderately similar to [PYRUVATE DE 


8.5 


323496 


AI826801 


Hs.300700 ESTs 


4.5 


323507 


H71721 


Hs.128387 ESTs 


4.4 


323545 


A1814405 


HS224569 ESTs 


5.8 


323623 


AA314280 


Ks.146589 EST cluster (not in UniGene) 


5 


323663 


AW263526 


HS243023 ESTs 


7.7 


323691 


AA317561 


Hs.145599 EST duster (not in UnlGene) 


5.9 


323810 


AA740405 


Hs.108806 ESTs 


62 


323846 


AA337621 


Hs.137635 ESTs 


6 


323929 


AA354940 


Hs.145958 ESTs 


10.7 


323959 


AI636775 


Hs.6831 ESTs 


5.4 


323996 


AA367032 


Hs217882 ESTs 


5.8 


323997 


AA844907 


Hs.274454 EST duster {not In UniGene) 


4.4 


324019 


A W1 77009 


EST duster (not in UniGene) 


4.6 


324130 


AL046575 


Hs.130198 ESTs 


11 


324295 


AI146686 


Hs.143691 ESTs 


13.7 


324298 


A1524039 


Hs.192524 ESTs 


6.8 


324307 


AA627642 


Hs.4994 transducer of ERBB2; 2 (TOB2) 


4.9 


324330 


AAS84766 


EST duster (not to UniGene) 


4.3 


324385 


F28212 


Hs284247 EST duster (not in UniGene) 


4.7 


324430 


AA464018 


Hs.184598 EST duster (not in UnlGene) 


13.6 


324452 


AW014022 


Hs.170953 ESTs 


7.6 


324547 


AW501974 


Hs.74170 ESTs 


5.6 


324603 


AW016378 


Hs292934 ESTs 


242 


324617 


AA508552 


Hs.195839 ESTs 


54 


324616 


AI346282 


Hs.87159 ESTs 


4.6 


324620 


AA448021 


Hs.94109 EST duster (not in UniGene) 


5.7 
324626 AI685464

324658 AI694767

324676 AW503943

324691 AI217963

324696 AA641092

324713 AW340249

324716 AI739168

324718 AI557019

324720 AA578904

324752 AI279919

324753 AA612626
324790 AI334367
324801 AI818924
324804 AI692552
324845 AA361016
324888 AI564134
324929 AI741633
324961 AA613792
325108 AA401863
326816 



327098 
328492 
329362 



330020 
330211 

330384 M23263 
330430 HG2261-HT2352 
330548 U31382 
330551 U39840 
330658 AA319514 
330700 AA037415 

330704 AA056557 

330705 AA102571 

330706 AA121140 
330712 AA167269 
330725 AA252033 
330732 AA281092 

330762 AA449677 

330763 AA450200 
330772 AM79114 
330786 D60374
330892 AA149579 
330949 H01458 
330977 H20826 
331017 N24619 
331099 R36671 
331128 R51361 
331151 R82331 
331195 T64447 

331320 AA262999 

331321 AA278355 
331337 AA287662 
331348 AA400596 
331359 AA416979 
331383 AA454543 
331422 F10802
331442 H77381 
331466 N21680 
331479 N27154 
331490 N32912 
331493 M34357 
331561 N82780 
331615 N92352 
331659 W46868 
331696 Z38907 
331811 AA404500 



ESTs 

Hs.129179 ESTs 
Hs.112451 ESTs

Hs.293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa
Hs.257339 ESTs
Hs.163440 ESTs

Hs.131798 EST cluster (not in UniGene)
Hs.116467 ESTs
Hs.292437 ESTs

Hs.272072 ESTs; Moderately similar to !!!! ALU SUB
Hs.144871 EST cluster (not in UniGene)
Hs.159337 ESTs
Hs.14553 ESTs
ESTs

Hs.337533 ESTs
Hs.136102 KIAA0853 protein
Hs.125350 ESTs 

EST cluster (not in UniGene) 
Hs.22380 ESTs 

CH.20_hs gi|6552458
CH.21_hs gi|5867660
CH.21_hs gi|6682516
CH.07_hs gi|5868455
CHJLhsgi|5868837
CH.16_p2 gi|6165201
CH.16_p2 gi|5091594
CH.16_p2 gi|6671887
CH.05_p2 gi|6013592
androgen receptor (dihydrotestosterone r
Hs.321110

Hs.399867 guanine nucleotide binding protein 4

hepatocyte nuclear factor 3; alpha
Hs.30732 ESTs
Hs.20999 ESTs
Hs.6759 ESTs
Hs.157078 ESTs

Hs.177576 ESTs; Moderately similar to kynurenine a
Hs.52620 ESTs

Hs.24052 ESTs; Weakly similar to !!!! ALU SUBFAMI
Hs.35254 ESTs

Hs.15251 Human DNA sequence from clone 437M21 on
Hs.143187 FK506-binding protein 3 (25kD)
Hs.11356 ESTs
EST 

Hs.91202 ESTs
Hs.142896 ESTs
Hs.315181 ESTs
Hs.108920 ESTs
Hs.14846 ESTs
Hs.268714 ESTs
Hs.268838 ESTs
Hs.168439 ESTs
Hs.300141 ESTs
Hs.87929 ESTs
Hs.118630 ESTs
Hs.88143 ESTs 
Hs.81897 ESTs 
Hs.43543 ESTs 

Hs.237339 ESTs; Moderately similar to !!!! ALU SUB 
Hs.41223 ESTs 
Hs.43455 ESTs 
Hs.44076 ESTs 

Hs^91039 ESTs; Weakly similar to hypothetical 43. 
Hs.93817 ESTs 
Hs.48703 ESTs 
Hs.5472 ESTs 
Hs.334305 ESTs 
Hs.65949 KIAA0888 protein 
Hs.187958 ESTs 
331848 AA417039 Hs.98268 signal recognition particle 72kD 75
331873 AM29445 Hs.98640 ESTs 65
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6
331867 AA460158 Hs.99589 KIAA1028 protein 6.8
331974 AA464518 Hs.105322 ESTs 55
332043 AA490831 Hs.201591 ESTs 10.8
332076 AA599477 Hs.291156 ESTs 4.4
332173 F09281 Hs.100725 ESTs 55
332247 N58172 ESTs 14.2
332249 N62096 Hs.194140 ESTs 72
332325 T79428 Hs.339667 ESTs 5.6
332396 AA340504 ESTs; Weakly similar to similar to human 212
332434 N75542 Hs.237731 transcription factor 4 15.3
332493 N95495 Hs.56729 ESTs; Highly similar to GTP-binding prot 7.1
332522 L38503 Hs.178357 glutathione S-transferase theta 2 65
332526 AA281753 Hs.17731 inositol 1;4;5-triphosphate receptor; ty 55
332530 M31682 Hs.19260 inhibin; beta B (activin AB beta polypep 55
332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1
332538 N48715 Hs.20991 ESTs 65
332546 D84454 Hs.22587 solute carrier family 35 (UDP-galactose 45
332594 AA279313 Hs.32951 methyl CpG binding protein 2 55
332610 AA412405 Hs.40513 ESTs; Weakly similar to BETA GALACTOSID A 5.6
332661 N95742 Hs.5390 ESTs 6.9
332697 T94885 Hs.75725 carboxypeptidase E 245
332712 D26070 Hs.79306 inositol 1;4;5-triphosphate receptor; ty 9.9
332716 L00058 Hs.79630 v-myc avian myelocytomatosis viral oncog 5.6
332726 R72029 Hs.83428 synaptophysin-like protein 5
332781 AA233258 ESTs; Weakly similar to 01 0075 [C.elega 45
332797 CH22_FGENES.6_2 305
332798 CH22_FGENES.6_5 665
332799 CH22_FGENES.6_6 195
332933 CH22_FGENES.38_7 55
332980 CH22_FGENES.54_1 55
332984 CH22_FGENES.54_6 4.9
333168 CH22_FGENES.94_1 4.7
333169 CH22_FGENES.94_2 4.4
333452 CH22_FGENES.157_1 45
333456 CH22_FGENES.157_5 45
333458 CH22_FGENES.157_7 45
333611 CH22_FGENES.217_6 4.7
333621 CH22_FGENES.219_5 5.5
333814 CH22_FGENES.282_2 7.1
333849 CH22_FGENES.290_8 62
333949 CH22_FGENES.303_5 45
333951 CH22_FGENES.303_7 4.9
333955 CH22_FGENES.303_11 55
334150 CH22_FGENES.339_1 5.1
334223 CH22_FGENES.360_4 205
334297 CH22_FGENES.372_3 9.4
334443 CH22_FGENES.387_2 4.6
334444 CH22_FGENES.387_4 5.6
334447 CH22_FGENES.387_7 13.1
334570 CH22_FGENES.405_11 5.4
334749 CH22_FGENES.427_1 55
334777 CH22_FGENES.430_9 47
334960 CH22_FGENES.485_29 52
335179 CH22_FGENES.504_9 85
335293 CH22_FGENES.527_6 4.7
335550 CH22_FGENES.576_11 5.1
335581 CH22_FGENES.581_19 5.7
335586 CH22_FGENES.581_25 45
335809 CH22_FGENES.617_6 62
335810 CH22_FGENES.617_7 55
335822 CH22_FGENES.619_7 7.1
335824 CH22_FGENES.619_11 85
335853 CH22_FGENES.626_5 4.3
335886 CH22_FGENES.632_4 45
336034 CH22_FGENES.678_5 65
336441 CH22_FGENES527_7 7.6
336624 CH22_FGENES.6-3 433
336625 CH22_FGENES.6-4 37.9
336679 CH22_FGENES.43-7 5.3
337577 CH22_C65E1.GENSCAN.8-1 4.9
338255 CH22_EM:AC005500.GENSCAN.276-3 134
338260 CH22_EM:AC005500.GENSCAN.279-10 4.6
338561 CH22_EM:AC005500.GENSCAN.421-5 4.6
338562 CH22_EM:AC005500.GENSCAN.421-6 4.3
338759 CH22_EM:AC005500.GENSCAN.517-6 5.1
338763 CH22_EM:AC005500.GENSCAN.517-16 5.5
338764 CH22_EM:AC005500.GENSCAN.517-17 7.1
TABLE 3A shows the accession numbers for those primekeys lacking unigene ID's for Table
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

123619 371681_1 AA602984 AA609200
116722 143512_1 Z24878 AM94098 F13654 AA434040 AA143127
103677 41847_1 Z83806 AJ132091 AJ132090
125992 1589048_1 H48372 W01626
109342 genbank_AA213620 AA213620
125154 genbank_W38419 W38419
101447 entrez_M21305 M21305
124357 genbank_N22401 N22401
108910 genbank_AA136590 AA136590
322278 47271_1 W69304 AF086283 W69200
315084 350959_1 AI821085 AW973464 AA554802 AI821831 AA657438 AA640756 AA650339
324019 262792_1 AW177009 AI381610
324330 300543_1 AA884766 AW974271 AA592975 AA447312
324626 336411_1 AI685464 AW971336 AA513587 AA525142
303029 37699_1 AF199613 AF108756
324804 308093_1 AI692552 AI393343 AI800510 AI377711 F24263 AA661876
324961 376239_1 AA613792 AW182329 T05304 AW858385

329362 OJLhs 
35 336624 CH22_4071FG_6_3_ 

336625 CH22_4072FG_6jL 

336679 CH22_4157FGjl3JL 

338255 CH22_6856FG_|.INK_EMAC00 

338260 CH2£.6663FGJ.(NICEM:AC00 
40 329929 c16_p2 

329960 c16_p2 

338561 CH22J294FG_LINK_EM:AC00 

338562 CH22„7295FG_UNieEMACOO 
338759 CH22_7581FG_LINK_EM:AC0O 

45 338763 CH22 7585FG_LINK_EM:AC00 
338764 CH22_7586FCL_UNK_EMAC00 

333168 CH22 400FG_94JJJNK_EM:A 

333169 CH22_401FG_94J2_UNK_EMA 
333452 CH22_702FGJ57_1_LINK_EM: 

50 333456 CH22_706FGJ57_5JJNK_EM: 
r 333458 CH22.708FQ_157_7_LINK-EM: 

333811 CH22J72FG_^17..6JJNK_EM: 

333821 CH22_882FG_219.5„UNK„EM: 

333814 CH22_1083FGL282JLUNK_B/I 
55 ' 333849 CH22J118FGJ90_8_UNK-EM 

335179 CH22_2515FG_504_9JJNK_EM 

333949 CH22 1225FG_3Q3_5_UNK_EM 

333951 CH22 1227FG_303_7JJNK_EM 

333955 CH22 1231FG_3Q3_11JJNK_E 
60 335293 CH22J635FG_527_6_UNK_EM 

326816 C20_hs 

326997 C21_hs 

335550 CH22_2905FG_576J1_UNK_E 
335581 CH22_£938FG_581_19JJNK_E 
65 335586 CH22_2944FG_581_25_UNK_E 
328492 c_7_hs

335809 CH22_3181FQ.617_6_LINK.EM 

335810 CH22_3182FQ_617_7_UNK_EM 
335822 CH22_3195FG_619_7_UNICEM 
335824 CH22_3197FG_619_11_UNK.E 
335853 CH22_3228FG_626_5_UNK_EM 
335888 CH22_3261FG.632_4_UNK_EM 
330020 c16_p2 

330211 c_5_p2 

337577 CH22_5B64FGJUNK_C65E1.G 
307848 AI384186 

332797 CH22 13FG_6_2_UNK_C4G1.G 

332798 CH22J4FG_6_5_LINK_C4G1.G 

332799 CH22.15FG_6_6JJNKjC4G1.G 
334150 CH22J429FG_339JJJNK_EM 
332933 CH22J 54FG .38.7 J.INK-C20H 
332980 CH2^204FGJ54_1.UNK_EM:A 
332984 CH22_208FG.54_6_UNK.EMA 
334223 CH22J507FG„360_4_UNK_EM 
334297 CH2^.1588FQ.372_3_UNK_EM 
327098 C21_hs 

334443 CH22_1742FG_387J_LINK.EM 

334444 CH22J743FG_387_4_UNieEM 
334447 CH22.1746FG_387_7_LINK„EM 
334570 CH22_1875FGJ05_11_UNK_E 
334749 CH22_2061FQ_427_1_UNK_EM 
334777 CH22_2089FG_430_9_UNK_EM 
336034 CH2S_3419FG_678_5.UNK_DJ 
334960 CH22_2281FG_465_29_UNK_E 
336441 CH22_3861FGJ27_7.UNK.DJ 

330551 9851_2 U39840 NMP04496 AW135607 BE087458 BE087567 M1771 16 AW195705 AW750756 AI811008 AI694151 

BE348594 AW971075 AI347950 AI201455 AI073898 AA652680 AA613671 AI318364 AA507650 AA693692 
AI032599 AA991871 AI269801 AW948974T74639 AA532907 AW949173 

330786 53973_3 BE379594 AI192455 AL039862 AI744012 AI761735 AW243181 AI743687 AI928223 AI423022 AI627855 

AI636059 A1651571 AW802044 AI826995 AI431733 AI539125 AA863056 AW270910 AI768930 AW008835 
AW615183 AW591 147 AI695294 AI672106 AA506358 A1308060 AA011556 AA962437 AI935488 BE219625 
AI004356 AW151394 AI218466 N66178 AI419784 AW242519 AW946907 D60374 AA989263 A1698799 
AA470460AI824167 

332247 372969_1 AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172

332396 20265_1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798

R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 R53463 H11063 
AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497 AW954769 AAO36808 
BE168063 AW382073 AW382085 AL041475 H80748 AI078181 BE483983 AI805213 AI761264 W94885 
N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 AI352312 AI367474 AW204807 
AI675502 AI337028 AW134715 BE328451 AI123157 AI560020 AI300745 AI608631 AI248873 AA742484 
AW051635 H18646 AI245045 AA507111 A1640510 AI925594 AA115747 AA143035 AA151106 
332781 32044J AK001764 BE313896 AA380199 AA380151 AA194996 AW1 18089 AA495871 AW975219 AW085598 

At378909 AW992310 AW992409 AI911857 AA657643 A1804471 A1242589 AI623968 R09556 AI129100 
A1206500 AA680094 AA677784 AI023178 AI277519 AA424742 A1240654 AA232846 AI604273 AI382376 
AA001729 W90790 BE090656 AW295015 AI674596 A1431734 AW20517 AW769185 A1128355 AI192474 
AI820001 AA001929 AA706925 AI076676 AI4991 19 AI200493 AI695919 AI376217 W69195 W69261 
AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 A1674387 AJ872616 
TABLE 3B shows the genomic positioning for those primekeys lacking unigene ID's and
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
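
As a reading aid, the strand and nucleotide-position columns can be turned into explicit exon coordinates. The sketch below is illustrative only: the helper name `exon_span` is hypothetical, and it assumes that both endpoints are inclusive and that a row may list the higher coordinate first (as in 216964-216798).

```python
def exon_span(strand, nt_position):
    """Parse one strand / nucleotide-position pair into normalized coordinates.

    Assumptions (not stated in the table): endpoints are inclusive, and the
    two numbers may appear in either order.
    """
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    first, second = (int(part) for part in nt_position.split("-"))
    start, end = min(first, second), max(first, second)
    return {"strand": strand, "start": start, "end": end, "length": end - start + 1}


# Two position strings as printed in the table.
minus_exon = exon_span("Minus", "216964-216798")
plus_exon = exon_span("Plus", "6548368-6548507")
```

Normalizing to `start <= end` keeps the strand as a separate field, so downstream comparisons of exon positions do not depend on the order in which a row was printed.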



Pkey Ref Strand Nt_position



333611 
333821 
333814 
333849 
333949 
333951 
333955 
334150 
334297 
334443 
334444 
334447 
334570 
334777 
335179 
335581 



335810 
335822 



336034 
336441 
337577 
338260 
332797 
332798
332799 
332933 
332980 
332984 
333168 
333169 
333452 
333456 
333458 
334223 
334749 
334960 
335293 
335550 
335853 
336624 
336625 
336679 
338255 
338561 



338759 
338763 
338764 



Dunham, I. etal 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, I. etal. 
Dunham, 1. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etaf. 
Dunham, L etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, I. etal 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, I. etal 
Dunham, I. etal. 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, I. etal 
Dunham, I. etal. 
Dunham, 1. etal 
Dunham, 1. etal 
Dunham, I. etal. 
Dunham, I. etal 
Dunham, total 
Dunham, I. etal. 
Dunham, I. etal. 



Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 

Plus

Plus

Plus

Plus

Plus

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 



6548368-6548507 

8597414-8597560 

7894165-7894252 

8018323-8018472 

8589634-8589791 

8592501-8592637 

8597414-8597560 

10529221-10529854 

13420934-13421058 

14298981-14299056 

14306433-14306492 

14308764-14308824 

14994868-14994943 

16259586-16260166 

21634405-21634526 

24976198-24976334 

24990333-24990497 

26310772-26310909 

26314767-26314849 

26364087-26364196 

26376860-26376942 

26934235-26934364 

29014404-29014590 

34187606-34187663 

595377-595678 

15458919-15459257 

216964-216798 

232147-231974 

232421-232307 

2035780-2035681 

5136166-5136019 

2632606-2632457 

3729896-3729788 

3730864-3730767 

5135165-5136019 

2631933-2631797 

5143942-5143306 

12734365-12734269 

16090686-16090106 

20160968-20160795 

22316408-22316275 

24668714-24668658 

26614629-26614506 

227714-227577 

229124-229024 

2035790-2035681 

15242294-15242231 

22311866-22311856 

22312594-22312465 

26582475-26582199

26628148-26628009 

26641232-26641101 
329960 5091594 Minus 1031-1162
329929 6165201 Minus 156410-156553
330020 6671887 Plus 172397-172491
326816 6552458 Plus 198354-198436
326997 5887660 Minus 71389-72147
327098 6682516 Minus 1061684-1062361
330211 6013592 Plus 59158-59215
328492 5868455 Minus 46094-46241
329362 5868837 Minus 65688-68173
TABLE 4 shows a preferred subset of the Accession numbers for genes found in Table 3
which are differentially expressed in prostate tumor tissue compared to normal prostate
tissue.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue

Pkey ExAccn UnigeneID Unigene Title R1

100819 HG4020-HT4290 Hs.2387 Transglutaminase 103
102698 U75272 Hs.1867 progastricsin (pepsinogen C) 10.6
102869 X02544 Hs.572 orosomucoid 1 22.6
105370 AA236476 Hs.22791 ESTs; Weakly similar to transmembrane pr 10.3
105645 AA282138 Hs.11325 ESTs 14
106094 AM19461 Hs.23317 ESTs 10.9
109014 AA156790 Hs.262036 ESTs 15.3
109562 F01811 Hs.187931 ESTs; Moderately similar to voltage-gate 10.8
113021 T23855 Hs.129836 KIAA1028 protein 103
114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0888 protein 21.3
122791 AA460158 Hs.129838 KIAA1028 protein 12.4
124352 N21626 Hs.102406 ESTs 102
301042 AI659131 Hs.197733 ESTs 24.9
302005 AI869666 Hs.123119 ESTs 36.8
302410 NM_004917 Hs.218366 EST cluster (not in UniGene) with exon h 26.8
302881 AA508353 Hs.105314 relaxin 1 (H1) 783
303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjug 19.5
303753 AW503733 Hs.9414 ESTs 13
310431 AI420227 Hs.149358 ESTs 72.9
311251 AI655662 Hs.197698 ESTs 41.3
311596 AI682088 Hs.79375 ESTs 26.4
312153 AA759250 Hs.118625 cytochrome b-561 11
312521 AA033609 Hs.239884 ESTs 112
313676 AA861697 Hs.120591 EST cluster (not in UniGene) 13.4
314171 AI821895 Hs.193481 ESTs 294
314907 AI672225 Hs.222886 ESTs 193
315051 AW292425 Hs.163484 EST 155
315052 AA876910 Hs.134427 ESTs 20
317548 AI654187 Hs.195704 ESTs 142
317869 AW295184 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 13.8
318428 AI949409 Hs.194591 ESTs 123
318524 AW291511 Hs.159068 ESTs 25.9
319080 Z45131 Hs.23023 ESTs 165
319763 AA460775 Hs.6295 ESTs 14.3
320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 562
321441 AW297633 Hs.118498 ESTs 14.7
322303 W07459 Hs.157601 EST cluster (not in UniGene) 22
322782 AA056060 Hs.202577 EST cluster (not in UniGene) 18.4
322818 AW043782 Hs.293616 ESTs 10.7
323287 AA639902 Hs.104215 ESTs 24.7
324603 AW016378 Hs.292934 ESTs 242
324617 AA508552 Hs.195839 ESTs 54
324658 AI694767 Hs.129179 ESTs 22
324691 AI217963 Hs.293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa 10.6
324696 AA641092 Hs.257339 ESTs 102
324718 AI557019 Hs.116487 ESTs 34.4
330211 CH.05_p2 gi|6013592 12.6
330430 HG2261-HT2352 Hs.321110 Antigen, Prostate Specific, Alt. Splice 13.8
330706 AA121140 Hs.177576 ESTs; Moderately similar to kynurenine a 145
330762 AA449677 Hs.15251 Human DNA sequence from clone 437M21 on 185
330892 AA149579 Hs.91202 ESTs 153
330949 H01458 Hs.142896 ESTs 103
331099 R36671 Hs.14846 ESTs 11.8
331151 R82331 Hs.268838 ESTs 13
331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC clone CIT 33.6
332247 N58172 ESTs 14.2
332396 AA340504 ESTs; Weakly similar to similar to human 212
332533 M99487 Hs.325825 folate hydrolase (prostate-specific memb 38.1
332697 T94885 Hs.75725 carboxypeptidase E 243
332797 CH22_FGENES.6_2 30.8
332798 CH22_FGENES.6_5 66.8
332799 CH22_FGENES.6_6 19.8
334223 CH22_FGENES.360_4 20.3
336624 CH22_FGENES.6-3 433
336625 CH22_FGENES.6-4 37.9
TABLE 4A shows the accession numbers for those primekeys lacking unigene ID's for Table
4. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accession

336624 CH22_4071FG_6_3_
336625 CH22_4072FG_6_4_
330211 c_5_p2
332797 CH22_13FG_6_2_LINK_C4G1.G
332798 CH22_14FG_6_5_LINK_C4G1.G
332799 CH22_15FG_6_6_LINK_C4G1.G
334223 CH22_1507FG_360_4_LINK_EM
332247 372969_1 AA669097 AA513815 AA026798 AA676528 AA704429 AA704269 AW118292 AA579216 N58172
332396 20265_1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85367 AW367811
AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW982899 AA713530 AW892948
R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497
AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 BE463983
AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777
AI352312 AI367474 AW204807 AI675502 AI337026 AW134715 BE328451 AI123157 AI560020
AI300745 AI608631 AI248873 AA742484 AW051635 H18646 AI245045 AA507111 AI640510 AI925594
AA115747 AA143035 AA151106
TABLE 4B shows the genomic positioning for those primekeys lacking unigene ID's and
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

332797 Dunham, I. et al. Minus 216964-216798
332798 Dunham, I. et al. Minus 232147-231974
332799 Dunham, I. et al. Minus 232421-232307
334223 Dunham, I. et al. Minus 12734365-12734269
336624 Dunham, I. et al. Minus 227714-227577
336625 Dunham, I. et al. Minus 229124-229024
330211 6013592 Plus 59158-59215
TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues.
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues
was subtracted from both the numerator and the denominator before the ratio was evaluated.
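
The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the software used to build the table: the function names, the linear-interpolation percentile convention, and the handling of a non-positive denominator are all assumptions, and the percentile and threshold defaults are simply the values quoted in the paragraph.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between ranks (an assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sample is undefined")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)  # rank is non-negative, so int() floors it
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def tumor_to_normal_ratio(tumor, normal, level_pct=85.0, background_pct=7.5):
    """Background-subtracted ratio of 'average' tumor to 'average' normal level."""
    # The background is estimated from the non-malignant samples only.
    background = percentile(normal, background_pct)
    numerator = percentile(tumor, level_pct) - background
    denominator = percentile(normal, level_pct) - background
    if denominator <= 0:
        # Assumption: the text does not say how this case was handled.
        raise ValueError("normal level does not exceed background")
    return numerator / denominator


def is_up_regulated(tumor, normal, threshold=3.44):
    """Apply the 'greater than or equal to 3.44' cut-off from the text."""
    return tumor_to_normal_ratio(tumor, normal) >= threshold


# Toy intensities for one hypothetical probeset (not data from the tables).
normal_levels = list(range(101))            # 0, 1, ..., 100
tumor_levels = [4 * v for v in range(101)]  # 0, 4, ..., 400
ratio = tumor_to_normal_ratio(tumor_levels, normal_levels)
```

Subtracting the same background value from numerator and denominator keeps a probeset with uniformly high non-specific signal from producing a misleadingly small ratio.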
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal tissue

Pkey ExAccn UnigeneID Unigene Title


446057 AI420227 Hs.149358 ESTs, Weakly similar to A46010 X-linked
400302 N48056 Hs.1915 folate hydrolase (prostate-specific memb
414569 AF109298 Hs.118258 prostate cancer associated protein 1
417407 AA923278 Hs.290905 ESTs, Weakly similar to protease [H.sapi
431579 AW971082 Hs.222888 ESTs, Weakly similar to TRHY_HUMAN TRICH
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo
409731 AA125985 Hs.56145 thymosin, beta, identified in neuroblast
400298 AA032279 Hs.61635 six transmembrane epithelial antigen of
420154 AI093155 Hs.95420 JM27 protein
433466 AA508353 Hs.105314 relaxin 1 (H1)
400236 AA305627 Hs.139336 ATP-binding cassette, sub-family C (CFTR
400292 AA250737 Hs.72472 ESTs
432887 AI926047 Hs.162859 ESTs
439176 AI446444 Hs.190394 ESTs, Weakly similar to 826096 line-1 pr
430722 AW968543 Hs.203270 ESTs, Weakly similar to ALU1_HUMAN ALU S
437052 AA861697 Hs.120591 ESTs
418396 AI765805 Hs.26691 ESTs
434036 AI659131 Hs.197733 hypothetical protein MGC2849
407709 AA456135 Hs.23023 ESTs
426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen
407168 R45175 ESTs
440260 AI972887 Hs.7130 copine IV
421513 X00949 Hs.105314 relaxin 1 (H1)
416370 N90470 Hs.203697 ESTs, Weakly similar to I38022 hypotheti
407122 H20276 Hs.31742 ESTs
400287 S39329 Hs.181350 kallikrein 2, prostatic
432244 AI669973 Hs.200574 ESTs
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2
415989 AI267700 Hs.111128 ESTs
418961 AW967646 Hs.23023 ESTs
425628 NM_004476 Hs.1915 folate hydrolase (prostate-specific memb
458509 AA654650 Hs.282906 ESTs
448290 AK002107 Hs.20843 Homo sapiens cDNA FLJ11245 fis, clone PL
428336 AA503115 Hs.183752 microseminoprotein, beta-
450096 AI682088 Hs.223368 holocarboxylase synthetase (biotin-[prop
400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen
437571 AA760894 Hs.153023 ESTs
453160 AI263307 Hs.146228 H2B histone family, member L
453096 AW294631 Hs.11325 ESTs
425075 AA506324 Hs.1852 acid phosphatase, prostate
407202 N58172 Hs.109370 ESTs



R1 

8642 
66.46 
53.36 
56.16 
5338 
4828 
4524 
43.48 
41.12 



3842 
38.00 



36.45 
3320 
33.02 
32.68 
3244 
32.10 
3130 
31.72 
3032 
30.10 
29.68 
2924 
28.90 
28.74 
28.74 
28.34 
2734 
2732 
2724 
27.16 
26.17 
2530 
2431 
24.74 
2436 
2446 
2423 
24.18 
424846
453370 
422805 
444917 
408826 
413597 
426429 
435981 
432966 
418848 
405685 
443271 
418819 
420757 
418994 
429918 
415539 
450382 
418829 
429984 
443822 
431676 
410330 
432441 
452792 
445472 
414565 
430487 
431716 
419536
439677 
449625 
408430 
447033 
453006 
431474 
420218 
408000 
416208 
430226 
415263 
432437 



429900 

449156 

411098 

435974

444484 

422728 

418601 



445885 
452712 
432189 
424565 
429290 
419264 
416445 
407275 
408369 
446720 
434988 
448172 
416182 
420544 
445413 



AU077324 

AI470523 

AM36989 

R68651 

AF216077 

AW302885 

X73114 

H74319 

AA650114 

AI820961



407819 
433444 



AA228776 

X78592 

AA296520 

AW873986 

AI733881 

AA397658 

AA516531 

AL050102

AI087412 

AI685464 

AW023630 

AW292425 

AB037765 

AB006631 

AA502972 

D87742 

D69053 

AA603305 

R82331 

NM_014253

S79876 

AI357412 

AI362575

AL133990 

AW958037 

L11690 

AW291168 

BE245562

AA948033 

W07088 

AI249388 

AA460421 

AF103907 

U80034 

U29690 

AK002126

AW937826 

AA279490 

AF179274 

AI734009 

AW838616 

AA527941 

AW102723 

AF203032 

AA877104 

AL043004 

AI364186 

R38438 

AI439138 

AI418055

N75276 

NM_004354 

AA677577 

AA151342 

AA889120 

R42185 

AW975324 



Hs.1832 

Hs.182356 

Hs.121017 

Hs.144997 

Hs.48376 

Hs.117183

Hs.169849

Hs.188620

Hs.193465 

Hs.195704 

Hs.191721 

Hs.99915 

Hs.89546 

Hs.119333

Hs.72472 

Hs.60257 

Hs.55999 

Hs.227209

Hs.143611

Hs.292638

Hs.46786

Hs.163484

Hs.50652

Hs.12784 

Hs.183390 

Hs.241552

Hs.268012

Hs.164599

Hs.23796

Hs.44926 

Hs.157601 

Hs.167133 

Hs.190642 

Hs.22437

Hs.620

Hs.41295

Hs.2551

Hs.130853

Hs.293685

Hs.98558 

Hs.30875 

Hs.171353 



Hs.37744 

Hs.103262 
Hs.86368 
Hs.22791
Hs.127699



Hs.75295
Hs.198760
Hs.293672
Hs.300678 

Hs.182575 

Hs.140546 

Hs.161160 

Hs.135904 

Hs.79069 

Hs.98732 

Hs.12677 

Hs.110637

Hs.274803

Hs.129816 



neuropeptide Y 

ATP-binding cassette, sub-family C (CFTR

H2A histone family, member A 

ESTs 

Homo sapiens clone HB-2 mRNA sequence 
ESTs 

myosin-binding protein C, slow-type

ESTs 

ESTs 

ESTs 

ESTs 
ESTs 



selectin E (endothelial adhesion molecul

ESTs 

ESTs 

Homo sapiens cDNA FLJ13598 fis, clone PL

NK homeobox (Drosophila), family 3, A

hypothetical protein FLJ21617

ESTs, Weakly similar to 2004399A chromos

gb:tt88f04.x1 NCI_CGAP_Pr28 Homo sapiens

ESTs 

ESTs 

KIAA1344 protein 

Homo sapiens mRNA for KIAA0293 gene, par 
hypothetical protein FU13590 
KIAA0268 protein 

fatty-acid-Coenzyme A ligase, long-chain 
gb:np12d11.s1 NCI_CGAP_Pr3 Homo sapiens
ESTs 

odz (odd Oz/ten-m, Drosophila) homolog 1

dipeptidylpeptidase IV (CD26, adenosine

ESTs 

ESTs 

ESTs 

ribosomal protein L4 

bullous pemphigoid antigen 1 (230/240kD) 

ESTs, Weakly similar to MUC2_HUMAN MUCIN

adrenergic, beta-2-, receptor, surface 

ESTs 

ESTs 

ESTs 

ESTs 

prostate cancer antigen 3

mitochondrial intermediate peptidase 

Homo sapiens beta-1 adrenergic receptor 

hypothetical protein FLJ11264

ESTs, Weakly similar to ZN91_HUMAN ZINC 

calmegin

transmembrane protein with EGF-like and 
KIAA1603 protein 

gb:RC5-LT0054-140200-013-D01 LT0054 Homo
gb:nh30cO4^1 NCI_CGAP_Pr3 Homo sapiens 
guanylate cyclase 1, soluble, alpha 3
neurofilament, heavy polypeptide (200kD) 
ESTs, Weakly similar to ALUB_HUMAN !!!!
KIAA0135 protein 

gb:qw34h07.x1 NCI_CGAP_Ut4 Homo sapiens

solute carrier family 15 (H+/peptide tra

ESTs 

ESTs 

ESTs 

cyclin G2

Homo sapiens Chromosome 16 BAC clone CIT

CGM47 protein

homeobox A10

ESTs 

ESTs 



23.57 
23.16 
22.52
22.28
22.02
21.76
21.32
21.12
21.07
21.06
20.90
19.98
19.94
19.72
19.56
19.04 
18.43 
18.34 
18.28 
1752 
17.68 
1754 
1752 
1741 
1759 
1750 
1652 
16.72 
16.50
16.50
16.46
16.32
16.28
16.02
15.74
15.70
15.64
15.54
15.48
15.40
15.38
15.26
15.21
1450 
1459 
1451 
14.76 
14.76 
14.60 
14.56
14.55
14.44
14.22
14.12
13.78
13.57
13.40
13.32
13.24
13.21
13.06 
13.02 
1258 
1254 
12.79 
12.64 
12.52
12.50
12.50






WO 02/30268 



PCT/US01/32045 






421059 


AI654133 


Hs.30212 


thyroid receptor interacting protein 15


420077 


AW512260 


Hs.87767


ESTs 


453930 


AM19466 


Hs.38727


hypothetical protein FLJ10903


441610 


AW576148 


Hs.148376 


ESTs 


451009 


AA013140 


Hs.115707


ESTs 


433764 


AW753676 


Hs.39982 


ESTs 


440266 


U29589 


Hs.7138 


cholinergic receptor, muscarinic 3 


443912 


R37257 


Hs.184780 


ESTs 


419526 


AI821895 


Hs.193481


ESTs 


423073 


BE252922 


Hs.123119 


MAD (mothers against decapentaplegic, Dr


452784 


BE463857 


Hs.151258 


hypothetical protein FLJ21062


414422 


AA147224 


Hs.71814


ESTs 


450203 


AF097994 


Hs.301528 


L-kynurenine/alpha-aminoadipate aminotra


436679 


AI127483 


Hs.120451 


ESTs, Weakly similar to unnamed protein 


440901 


AA909358 


Hs.128612 


ESTs 


446045 


AJ297436 


Hs.20166


prostate stem cell antigen 


433397 


AW204232 


Hs.279522


ESTs 


434980 


AW770553 


Hs.293640


sterol O-acyltransferase (acyl-Coenzyme 


425905 


AB032959 


Hs.161700 


novel C3HC4 type Zinc finger (ring finge


434680 


T11738 


Hs.127574 


ESTs 


449650 


AF055575 


Hs.297647


calcium channel, voltage-dependent, L ty 


431173 


AW971198 


Hs.294068


ESTs 


434539 


AW746078 


Hs.214410


ESTs, Weakly similar to MUC2_HUMAN MUCIN 


410037 


AB020725 


Hs.38009


KIAA0918 protein 


417708 


N74392 


Hs.50495 


ESTs 


458332 


AI000341 


Hs.220481


ESTs 


420381 


D50640 


Hs.301762 


phosphodiesterase 3B, cGMP-inhibited 


425665 


AK001050 


Hs.159066 


hypothetical protein FLJ10188


425710 


AF030880 


Hs.159275 


solute carrier family, member 4 


428728 


NM_016625


Hs.191381 


hypothetical protein 


407021 


U52077 




gb:Human mariner1 transposase gene, comp


410733 


D84284 


Hs.66052 


CD38 antigen (p45) 


401714 








434485 


AI623511 


Hs.118567


ESTs 


415786 


AW419196 


Hs.257924


hypothetical protein FLJ13782


452340 


NM_002202


Hs.505 


ISL1 transcription factor, LIM/homeodoma 


453628 


AW243307 


Hs.170187


hypothetical protein 


408063 


BE0BS548 


Hs.42346 


calcineurin-binding protein calsarcin-1


417687 


AIS28596 


Hs.250691


ESTs 


434668 


AF151103



Hs.112259


T cell receptor gamma locus


432374 


W68815 


Hs.301885


Homo sapiens cDNA FLJ11346 fis, clone PL


428819 


AL135623


Hs.193914 


KIAA0575 gene product


413409 


AI638418 


Hs.21745


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep


*r£Of fO 


AAA34579 


Hs.143691


ESTs 


436556 


AI364997


Hs.7572 


ESTs 


441690 


R81733 


Hs.33106 


ESTs 


419852 


AW503756 


Hs.286184


hypothetical protein dJ551D2.5




NM_014018


Hs.110488


KIAA0990 protein


423698 


AA329796 


Hs.1098 


DKFZp434J1813 protein 




A1922SB8 


Hs.172510


ESTs 




WE7554 


Hs.125019



ESTs 






Hs.116467


small nuclear protein PRAC


445424 


AB028945


Hs.12696


cortactin SH3 domain-binding protein 






no. 1 £9 1 1 a 


Homo sapiens cDNA FLJ13581 fis, clone PL


433104 


AL043002 


Hs.128246 


ESTs, Moderately similar to unnamed prot 


452744 


AI267652


Hs.30504 


Homo sapiens mRNA; cDNA DKFZp434E082 (fr 


431217 


NM_013427


Hs.250830


Rho GTPase activating protein 6 


427398 


AW390020 


Hs.20415


chromosome 21 open reading frame 11


446898 


T15767 


Hs.22452


Homo sapiens mRNA for KIAA1737 protein,


421470 


R27496 


Hs.1378 


annexin A3 


406554 








401424 








407902 


AL117474 


Hs.41181 


Homo sapiens mRNA; cDNA DKFZp727C191 (fr


423545 


AP000692 


Hs.129781 


chromosome 21 open reading frame 5 


439024 


R96696 


Hs.35598 


ESTs 


431548 


AI834273 


Hs.9711 


novel protein 


409262 


AK000631 


Hs.32256


hypothetical protein FLJ20624


446271 


D32464 


Hs.100469 


ESTs 


448692 


AW013907 


Hs.224276


methylcrotonoyl-Coenzyme A carboxylase 2



12.30

12.24

12.22

12.20

12.18 

12.16 

12.04 

11.92 

11.91 

11,67 

11.86 

11.76 

11.68 

11.60 

11.60 

11.51

11.50

11.38

11.33

11.32

11.18 

11.16 

11.16 

11.14 

11.14 

11.12 

11.10 

11.08 

11.08 

11.04 

11.02 

11.02 

10.90 

1039 

1037 

1035 

10.72 

10.67 

10.64 

1033 

1030 

10.48

10.44

10.21

10.20

10.14 

10.10 

10.04 

10.02 

10.00 

9.98 

9.97 

9.96 

9.88 

934 

9.82 

9.75 

9.70 

9.70 

9.64 

930 

938 

9.56 

934 

931 

9.48

9.45 

9.42 

9.28
414140 AA281279 Hs.23317 hypothetical protein FLJ14681 9.24

435980 AF274571 Hs.129142 deoxyribonuclease II beta 9.24

421246 AW582962 Hs.300961 CGW7 protein 9.20 

427304 AA781526 Hs.163853 ESTs 9.16 

442914 AW188551 Hs.99519 hypothetical protein FLJ14007 9.16

413627 BE182082 Hs.246973 ESTs 9.14 

439699 AF086534 Hs.187661 ESTs, Moderately similar to ALU1_HUMAN A 9.10 

437718 AI927288 Hs.183779 ESTs 9.07 

439820 AL360204 Hs.283853 Homo sapiens mRNA full length insert cDN 9.06

447342 AI199268 Hs.19322 Homo sapiens, Similar to RIKEN cDNA 2010 9.05

446223 BE300091 Hs.119699 hypothetical protein FLJ12969 9.04

410001 AB041036 Hs.57771 kallikrein 11 9.03 

424012 AW368377 Hs.137569 tumor protein 63 kDa with strong homolog 9.03

441791 AW372449 Hs.175982 hypothetical protein FLJ21159 9.02

448206 BE622565 Hs.3731 ESTs, Moderately similar to I38022 hypot 9.02

414269 AA298489 olfactory receptor, family 51, subfamily 839 

442081 AA401863 Hs.22380 ESTs 8.88

420092 AA814043 Hs.88045 ESTs 8.85 

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.80

421863 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 8.80

454141 AW138413 Hs.182356 ATP-binding cassette, sub-family C (CFTR 8.80

418278 AI088489 Hs.83937 hypothetical protein 8.78 

428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, 8.76

432415 T16971 Hs.269014 ESTs, Weakly similar to A43932 mucin 2 p 8.75

424906 AI566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3' 8.74

415245 N59650 Hs.27252 ESTs 8.72 

442409 BE208843 Hs.129544 hypothetical protein MGC15438 8.70 

404571 8.66 

418033 W68180 Hs.259855 elongation factor-2 kinase 8.64 

456497 AW967956 Hs.123648 ESTs, Weakly similar to AF108460 1 ubinu 8.56

405876 8.54

448807 AI571940 Hs.7549 ESTs 8.52

445372 N36417 Hs.144928 ESTs 8.48 

425171 AW732240 Hs.300615 ESTs 8.44 

419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 8.36

407385 AA610150 Hs.272072 ESTs, Weakly similar to I38022 hypotheti 8.31

433172 AB037841 Hs.102652 hypothetical protein ASH1 8.30

422631 BE218919 Hs.118793 hypothetical protein FLJ10688 8.27

412719 AW016610 Hs.129911 ESTs 8.24

418849 AW474547 Hs.53565 Homo sapiens PIG-M mRNA for mannosyltran 8.22

444922 AI921750 Hs.144871 Homo sapiens cDNA FLJ13752 fis, clone PL 8.22

427674 NM_003528 Hs.2178 H2B histone family, member Q 8.20

432101 AI918950 Hs.11092 EphA3 8.17 

416288 H51299 gb:yp07c08.s1 Soares breast 3NbHBst Homo 8.15 

404915 8.08

440106 AA864968 Hs.127699 KIAA1603 protein 8.07

442861 AA243837 Hs.57787 ESTs 8.06 

452259 AA317439 Hs.28707 signal sequence receptor, gamma (translo 8.06 

443250 AI041530 Hs.132107 ESTs 8.06 

437257 AW511443 Hs558110 ESTs 8.04

452891 N75582 Hs.212875 ESTs, Weakly similar to DYH9_HUMAN CILI 8.02

422219 AW978073 regulator of mitotic spindle assembly 1 8.00 

453049 BE537217 Hs.30343 ESTs 8.00 

439731 AI953135 Hs.45140 hypothetical protein FLJ14084 7.98

408554 AA836381 Hs.7323 nuclear receptor co-repressor/HDAC3 comp 7.94

421154 AA284333 Hs.287631 Homo sapiens cDNA FLJ14269 fis, clone PL 7.94

430107 AA465293 Hs.105069 ESTs 7.94 

433404 T32982 Hs.102720 ESTs 7.93 

450813 AI739625 Hs.203376 ESTs 7.90 

418239 AL038450 Hs.48948 ESTs 7.85

448212 AI475858 gb1c87d07j(1 NCi_CQAP_Cai Homo sapiens 7.82

449532 W74653 Hs.271593 ESTs, Moderately similar to A47582 B-cel 7.82 

413930 M86153 Hs.75618 RAB11A, member RAS oncogene family 7.80 

458191 AI420611 Hs.127832 ESTs 7.80 

444858 AI199738 Hs.208275 ESTs, Weakly similar to ALUA_HUMAN !!!! 7.78

457498 AI732230 Hs.191737 ESTs 7.78

407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 7.76 

433759 AA680003 Hs.109363 Homo sapiens cDNA: FLJ23603 fis, clone L 7.74

433805 AA706910 Hs.112742 ESTs 7.74
426485 NM_006207 Hs.170040 platelet-derived growth factor receptor- 7.72

446028 R44714 Hs.106795 Homo sapiens cDNA FLJ13136 fis, clone NT 7.72

418555 AI417215 Hs.87159 hypothetical protein FLJ12577 7.70

447499 AW262580 Hs.147674 protocadherin beta 16 7.70

419839 U24577 Hs.93304 phospholipase A2, group VII (platelet-ac 7.68

416857 AA188775 Hs.292453 ESTs 7.68

413801 M62246 Hs.35406 ESTs, Highly similar to unnamed protein 7.66 

425480 AB023198 Hs.158135 KIAA0981 protein 7.66 

420120 AL049610 Hs.95243 transcription elongation factor A (SII) 7.64

424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR 7.64

446307 T50083 Hs.9094 ESTs 7.63 

429220 AW207206 Hs.136319 ESTs 7.59 

420345 AW295230 Hs.25231 ESTs 7.54

429208 AA447990 Hs.190478 ESTs 7.54

447247 AW369351 Hs.287955 Homo sapiens cDNA FLJ13090 fis, clone NT 7.53

440995 T57773 Hs.10263 ESTs 7.53

448706 AW291095 Hs.21814 interleukin 20 receptor, alpha 7.52

410227 AB009284 Hs.61152 exostoses (multiple)-like 2 7.49

431616 AA508552 Hs.195839 ESTs, Weakly similar to I38022 hypotheti 7.46 

434217 AW014795 Hs.23349 ESTs 7.44

431467 N71831 Hs.256398 Homo sapiens mRNA; cDNA DKFZp434E0528 (f 7.42

448519 AW175665 Hs.244334 Homo sapiens prostein mRNA, complete cds 7.42

446791 AI632278 Hs.34981 ESTs 7.40

419743 AW408762 Hs.127478 Homo sapiens clone 24416 mRNA sequence 7.39

445855 BE247129 Hs.145569 ESTs 7.36

425211 M18667 Hs.1887 progastricsin (pepsinogen C) 7.35

419131 AA406293 Hs.301622 ESTs 7.34

400294 N957G6 Hs.179809 Homo sapiens prostein mRNA, complete cds 7.33

441736 AW292779 Hs.169799 ESTs 7.28

427701 AM11101 Hs.221750 nuclear autoantigenic sperm protein (his 7.24

457733 AW974812 Hs.291971 ESTs 7.24

418432 M14156 Hs.85112 insulin-like growth factor 1 (somatomedi 7.22

441201 AW118822 Hs.128757 ESTs 7.21

419953 BE267154 Hs.125752 ESTs 7.20

419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 7.20

425018 BE245277 Hs.154196 E4F transcription factor 1 7.20

424560 AA158727 Hs.150555 protein predicted by clone 23733 7.18 

435380 AA679001 Hs.192221 ESTs 7.14 

420658 AW965215 Hs.130707 ESTs 7.12 

408291 AB023191 Hs.44131 KIAA0974 protein 7.10

409110 AA191493 Hs.48778 niban protein 7.10

414485 W27026 Hs.182625 VAMP (vesicle-associated membrane protei 7.10

430039 BE253012 Hs.153400 ESTs, Weakly similar to ALU1_HUMAN ALU S 7.10 

450832 AW970602 Hs.105421 ESTs 7.10 

417153 X57010 Hs.81343 collagen, type II, alpha 1 (primary oste 7.08

412446 AI768015 Hs.92127 ESTs 7.07 

412953 Z45794 Hs.238809 ESTs 7.06

418051 AW192535 Hs.19479 ESTs 7.06 

421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 7.04

446999 AA151520 Hs.279525 hypothetical protein MGC4485 7.04

440529 AW207640 Hs.16478 Homo sapiens cDNA: FLJ21718 fis, clone C 7.04

441111 AI806867 Hs.126594 ESTs 7.01

451027 AW519204 Hs.40808 ESTs 7.00

408432 AW195262 gb:xn67b05.x1 NCLCGAPJ5ML1 Homo sapiens 7.00 

432223 AA333283 Hs.285336 Homo sapiens, clone IMAGE:3460280, mRNA 7.00

444805 AB007899 Hs.12017 homolog of yeast ubiquitin-protein ligas 6.99

414212 AA136569 Hs.295940 KIAA0187 gene product 6.98

431725 X65724 Hs.2839 Norrie disease (pseudoglioma) 6.98

449685 AW296869 Hs.36095 ESTs 6.97

447313 U92981 Hs.18081 Homo sapiens clone DT1P1B6 mRNA, CAG rep 6.96

424590 AW966399 Hs.46821 hypothetical protein FLJ20086 6.94

449655 AI021987 Hs.39970 ESTs 6.92

419563 AA526235 Hs.193162 Homo sapiens cDNA FLJ11983 fis, clone HE 6.90

434163 AW974720 Hs.25206 group XII secreted phospholipase A2 6.89

415809 Z32789 Hs.46601 ESTs 6.86

425782 U66468 Hs.159525 cell growth regulatory with EF-hand doma 6.85

417958 AA767382 Hs.193417 ESTs 6.84

427408 AA583206 Hs.2156 RAR-related orphan receptor A 6.79

445873 AA250970 Hs.251946 poly(A)-binding protein, cytoplasmic 1-l 6.74
410718


AI920783 


Hs.191435 


ESTs 


6.74 




432363 


AA534489 




gbmf76g11.s1 NCI_CGAP_Co3 Homo sapiens 


6.74 




438521 


AW203986 


Hs.213003


ESTs


6.73




435804 


AA625279 


HS26B92 


uncharacterized bone marrow protein BM04 


6.73 




419083 


AI479560 


Hs.98613 


Homo sapiens cDNA FLJ12292 fis, clone MA


6.72 




418245 


AA088767 


Hs.33883


transmembrane, prostate androgen induced 


6.70 




420714 


BE172704 


Hs.222746


KIAA1610 protein 


6.70 




412707 


AW206373 


Hs.16443 


Homo sapiens cDNA: FLJ21721 fis, clone C


6.67 




421898 


N62293 


Hs.45107 


ESTs 


6.66 


411078 


AI222020


Hs.182364


CocoaCrisp


6.66 




452465 


AA610211 


Hs.34244


ESTs 


6.66 




422763 


AA033699 


Hs.83938 


ESTs, Moderately similar to MAS2_HUMAN M 


6.66 




444618 


AV653785 


Hs.300171 


ELL-RELATED RNA POLYMERASE II, ELONGATIO


6.64 




450164 


AI239923 


Hs.30098 


ESTs 


6.63 


431060 


AF039307 


Hs.249171


homeobox A11


6.62 




408031 


AA081395 


Hs.42173 


Homo sapiens cDNA FLJ10366 fis, clone NT


6.62 




420285 


AA258124 


Hs.293878


ESTs, Moderately similar to ZN91_HUMAN Z


6.62 




444670 


H58373 


Hs.37494


hypothetical protein MGC5370 


6.62 




444489 


AI151010



Hs.157774 


ESTs 


6.60 


445685 


AW779829 


Hs.263436


gb:hn88a05.x1 NCI_CGAP_Kid11 Homo sapien


6.60 




435677 


AA694142 


Hs.293726


ESTs, Weakly similar to TSQA RAT TESTIS 


6.59 




452221 


C21322 


Hs.11577 


hypothetical protein FLJ22242


6.59 
6.56 




431510 


AA580082 


Hs.112264


ESTs 




415874 


AF091622 


Hs.78893 


KIAA0244 protein 


6.54 


418405 


AI868282


Hs.11898 


ESTs, Highly similar to KIAA1370 protein 


6.54 




452768 


AW069459 


Hs.61539 


ESTs 


6.54




401451 








6.52 




416289 


W28333 




ESTs 


6.52 




431778 


AL08Q276 


Hs.268562


regulator of G-protein signalling 17


6.51


409089 


NM_014781


Hsf0421 


KIAA0203 gene product 


6.50




442833 


AA328153 


Hs.88201 


ESTs, Weakly similar to A Chain A, Cryst 


6.50 




431892 


NM_002742


Hs.2891


protein kinase C, mu 


6.49 




418833 


AW974899 


Hs.292776


ESTs 


6.48 




429163 


AA884768 




gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s


6.46 


430403 


AF039390 


Hs.241382


tumor necrosis factor (ligand) superfami 


6.46 




•rWUOO 




Hs.16732


ESTs 


6.46 




A185EU 


AA631143 


Hs.179809


Homo sapiens prostein mRNA, complete cds 


6.44 




AV)FC7A 




Hs.257339


ESTs, Weakly similar to I38022 hypotheti


6.44 




423600 


AI833559


Hs.29076


ESTs


6.44 












6.42 




A33fiin 


AAS06822 


Hs.112547


ESTs 


6.42 








Hs.105700


secreted frizzled-related protein 4


6.41 




*tU/ 1 10 


AA1SR7Q0 


Ho2G2036 


ESTs, Weakly similar to Z223.HUMAN ZINC 


6.40 






W7Q7^fl 


Hs.136102 


KIAA0853 protein 


6.40 




AIQOCQ4A 


Hs.104530


fetal hypothetical protein



6.40 




A33!>ft5 


AW975944 


Hs23739fi 


ESTs 


6.40 








Hs.50252


mitochondrial ribosomal protein L32


6.40 




44fima 
440109 


UQC004 


ns.£ i*tv io 


ESTs 


6.40 




41Q0U0 




Hs.75933


peroxisomal biogenesis factor 7


6.38 




H57585 


Hs.37467


ESTs 


6.36 




4O34U0 


RP4ARR3Q 


nSiO If fa 


Homo sapiens cDNA FLJ13591 fis, clone PL


6.34 




4907RQ 


KIM IYL4Q17 


Hs.218386


kallikrein 4 (prostase, enamel matrix, p


6.34








Hs.157148


hypothetical protein MGC13204



6.32




4ZQ040 




Hq 1*>Qfi97 




6.32


439221 


AA737106 


Hs.32250 


ESTs, Moderately similar to 178885 serin 


6.32




428194 


AA765603 


Hs.180877 


H3 histone, family 3B (H3.3B)


6.30




431958 


X63629 


Hs.2877


cadherin 3, type 1, P-cadherin (placenta 


6.30 




439366 


AF100143 


Hs.6540 


fibroblast growth factor 13 


6.30 




452789 


AW081626 


Hs.242561 


ESTs 


6.30 


416836 


D54745 


Hs.80247 


cholecystokinin 


6.30 




436962 


AW377314 


Hs.5364 


DKFZP564I052 protein 


6.29




433383 


AF034837 


Hs.192731 


double-stranded RNA specific adenosine d 


6.29




418636 


AW749855 




gb:QV4-BT0534-281299-053-c05 BT0534 Homo


6.26




450728 


AW162923 


Hs.25363


presenilin 2 (Alzheimer disease 4)


6.25


440293 


AI004193 


Hs221?3 


ESTs 


6.24




453745 


AA952989 


Hs.63908 


hypothetical protein MGC14726 


6.24




426595 


AW971980 


Hs.62402 


p21/Cdc42/Rac1-activated kinase 1 (yeast


6.24




444412 


AI147652 


Hs.216381


Homo sapiens clone HH409 unknown mRNA


6.24




413384 


NM_000401


Hs.75334 


exostoses (multiple) 2 


6.22









426320 


W47595 


Hs.169300 


transforming growth factor, beta 2 


423349 


AF010258 


Hs.127428 


homeoboxAS 


429165 


AW009686 


Hs.118253


prostate cancer associated protein 1 


424600 


AL035568 


Hs.153203 


MyoD family inhibitor 


409564 


AA045857 


Hs.54943


fracture callus 1 (rat) homolog


438796 


W67821 


Hs.109590 


genethonin 1 


425451 


AF242769 


Hs.157461 


mesenchymal stem cell protein DSC54


451663 


AI872360 


Hs.209293 


ESTs


413623 


AA825721 


Hs.246973 


ESTs 


452232 


AW020603 


Hs.271698 


radial spoke protein 3 


453390 


AA862496 


Hs.28482 


ESTs 


435542 


AA687376 


Hs.269533 


ESTs 


420424 


AB033036 


Hs.97594 


KIAA1210 protein 


407103 


AA424881 


Hs.256301 


hypothetical protein MGC13170 


409734 


BE161664 


Hs.56155 


hypothetical protein 


432686 


BE223007 


Hs.152460


Homo sapiens cDNA FLJ12909 fis, clone NT


438361 


AA805666 


Hs.146217 


Homo sapiens cDNA: FLJ23077 fis, clone L


411479 


AW848047 




gb:ILMT0214-291299-Q52-A12 CT0214 Homo 


438849 


W28948 


Hs.10762 


ESTs 


452726 


AF188527 


Hs.61661 


ESTs, Weakly similar to AF1746G5 1 F-box 


445895 


D29954 


Hs.13421 


KIAA0056 protein 


440774 


AI420611 


Hs.127832 


ESTs 


422583 


AA410506 


Hs.118578


KIAA0874 protein 


427500 


AW970017 


Hs.293948 


ESTs, Weakly similar to S65657 alpha-1C-


443646 


AI085198 


Hs.288699 


ESTs 


410566 


AA373210 


Hs.43047 


Homo sapiens cDNA FLJ13585 fis, clone PL


417845


AL117461


Hs.82719 


Homo sapiens mRNA; cDNA DKFZp586F1822 (f 


430273 


AI311127 


Hs.125522 


ESTs 


434792 


AA649253 


Hs.132458 


ESTs 


442490 


AW965078 


Hs.30212


thyroid receptor interacting protein 15 


420026 


AI831190 


Hs.166676 


ESTs 


437782 


AI370876 


Hs.123163 


exportin 1 (CRM1, yeast, homolog) 


447359 


NM_012093


Hs.18268 


adenylate kinase 5 


447713 


AW20733 


Hs.207083


ESTs 


451073 


AI758905 


Hs.206063 


ESTs 


451640 


AA195601 


Hs.26771 


Human DNA sequence from clone 747H23 on


410889 


X91662 


Hs.66744 


twist (Drosophila) homolog (acrocephalos


441222 


AI277237 


Hs.44208


hypothetical protein FLJ23153


447732 


AI758398 


Hs.161318 


ESTs 


437756 


AA767537 


Hs.197086 


ESTs 


408829 


NM_006042


Hs.48384


heparan sulfate (glucosamine) 3-O-sulfot


453911 


AW503857 


Hs.4007 


Sarcolemmal-associated protein


414085


AA114016


Hs.75746


aldehyde dehydrogenase 1 family, member 


408875 


NM_015434


Hs.48604 


DKFZP434B168 protein 


439451 


AF086270 


Hs.278554 


heterochromatin-like protein 1


423853 


AB011537 


Hs.133466 


slit (Drosophila) homolog 1


453060 


AW294092 


Hs.21594 


hypothetical protein MGC15754


420407 


AA814732 


Hs.145010 


lipopolysaccharide-specific response 5-li


450480 


X82125 


Hs.25040 


zinc finger protein 239 


408446 


AW450669 


Hs.45068


hypothetical protein DKFZp434I143


421039 


NM_003476 


Hs.101299


cullin 5


451684 


AF216751 


Hs.26813


CDA14 


436063 


AK000028 


Hs.250867


ribosomal protein S24 


410507 


AA355288 


HS^71408 


transitional epithelia response protein


420179 


N74530 


Hs.21168 


ESTs 


453878 


AW964440 


Hs.19025 


OC32 


452270 


AW975014 


Hs.26


ferrochelatase (protoporphyria) 


435867 


AA954229 


Hs.114052


ESTs


417683 


AW566008 


Hs.239154 


ankyrin repeat, family A (RFXANK-like),


432005 


AA524190 


Hs.120777 


ESTs, Weakly similar to ELL2_HUMAN RNA P


406815 


AA833930 


Hs.288036 


tRNA isopentenylpyrophosphate transferas 


437980 


R50393 


Hs.278436


KIAA1474 protein 


425856 


AA364908 


Hs.98927 


hypothetical protein FLJ13993


400301 


X03635 


Hs.1657 


estrogen receptor 1 


446261 


AA313893 


Hs.13399


hypothetical protein FLJ12615 similar to


410141 


R07775 


Hs.287657 


Homo sapiens cDNA: FLJ21291 fis, clone C


427258 


AA400091 


Hs.39421 


ESTs 


419108 


AA389724 


Hs.191264 


ESTs, Weakly similar to ALU7_HUMAN ALU S


442029 


AW956698 


Hs.14456


neural precursor cell expressed, develop
407783


AW996872 


Hs.172028 


434408 


AI031771 


Hs.132586 


415077 


U1607 


Hs.934 


432435 


BE218886 


Hs.282070 


433313 


W20128 


Hs.296039


431740 


N75450 


Hs.183412 


412991 


AW949013 




418852 


BE537037 


Hs.273294 


418882 


NM_004996


Hs.83433 


446887 


AB007891 


Hs.16349


437866


AA156781


Hs.63992 


410232 


AW372451 


Hs.61184 


414452 


AA454038 


Hs.29032 


422762 


AL031320 


Hs.119976


428730 


AA625947 


Hs.25750


431571 


AW500486 


Hs.160610


433393 


AF038564 


Hs.98074 


450616 


AL133067 


Hs.25214


443774 


AL117428 


Hs.9740 


446100 


AW967109 


Hs.13804 


419168 


AI336132 


Hs.33718 


416653 


AA768553 


Hs.77496 


452679 


Z42387 


Hs.4299 


450244 


AA007534 


Hs.125062 


408621 


AI970672 


Hs.46633 


450325 


AI935962 


Hs.26289 


439671 


AW162840


Hs.6641 


452387 


AI680772


Hs.4316 


413992 


W26276 


Hs.136075 


444151 


AW972917 


Hs.128749 


417791 


AW965339 


Hs.111471 


410186 


AI936442 


Hs.59838 


415123 


D60925 




429170 


NM_001394


Hs.2359 


434415 


BE177494 




440738 


AI004650 


Hs.225674 


443830 


AI142095 


Hs.143273 


449603 


AI655662 


Hs.197698 


414342 


AA742181 


Hs.75912 


422634 


NM_016010


Hs.118821


435047 


AA454985 


Hs.54973 


400268 






452055 


AI377431 


Hs.293772 


437073 


AJ885608 


Hs.94122 


434072 


H70854 


Hs.283059


418339 


AA639902 


Hs.104215 


434551 


BE387162 


Hs.280858 


439569 


AW602166 


Hs.222399 


441102 


AA973905 


Hs.16003


448310 


AI480316 




413173 


BE076928 


Hs.70980 


436246 


AW450963 


Hs.119991


449300 


AI656959 


Hs.222165 


452823 


AB012124 


Hs.30696 


451403 


AA885569 


Hs.15727 


417061 


AI675944 


Hs.188691 


429126 


AW172356 


Hs.99083 


431316 


AA502663 


Hs.145037 


439192 


AW970536 


Hs.105413 


431938 


AA938471 


Hs.115242


451552 


AA047233 


Hs.33810 


416991 


N36389 


Hs.295091 


427638 


AA406411 


Hs.208341 


427718 


AI798680 


Hs.25933 


438710 


AA833907 


Hs.178724 


406076 


AL390179 


Hs.137011 


431263 


AW129203 


Hs.13743 


421264 


AL039123 


Hs.103042 


421685 


AF189723 


Hs.106778 



a disintegrin and metalloproteinase doma
ESTs

glucosaminyl (N-acetyl) transferase 2, I

ESTs 

ESTs 

ESTs, Moderately similar to AF116721 67
gb:QV4-FT0005-110500-201-e12 FT0005 Homo
hypothetical protein FLJ20069
ATP-binding cassette, sub-family C (CFTR
KIAA0431 protein
metallothionein 1E (functional)
CGI-79 protein 
ESTs 

Human DNA sequence from clone RP1-20N2 o 
ESTs 

splicing factor proline/glutamine rich (
itchy (mouse homolog) E3 ubiquitin prote 
hypothetical protein 
DKFZP434A236 protein 
hypothetical protein dJ462023.2 
Homo sapiens cDNA FLJ12641 fis, clone NT
metallothionein 1E (functional)
transmembrane, prostate androgen induced
ESTs 

chromosome 11 open reading frame 8 
ESTs 

kinesin family member 5C
trinucleotide repeat containing 12 
RNA, U2 small nuclear 
alpha-methylacyl-CoA racemase 
ESTs 

hypothetical protein FLJ10808
ESTs 

dual specificity phosphatase 4 

gb:RC6-HT0598-270300-011-C05 HT0598 Homo

WD repeat domain 9 

ESTs 

ESTs 

KIAA0257 protein 
CGI-62 protein
cadherin-like protein VR20 

hypothetical protein MGC10858 
ESTs 

Homo sapiens PRO1082 mRNA, complete cds 
ESTs, Moderately similar to SPCN_HUMAN S 
ESTs, Highly similar to A35661 DNA excis 
CEGP1 protein

intermediate filament protein syncoilin

gbim26h09jc1 Soares.NRJT.QBC.SI Homos 

ESTs 

ESTs 

ESTs 

transcription factor-like 5 (basic helix 

Homo sapiens cDNA FLJ14511 fis, clone NT

Homo sapiens cDNA FLJ12033 fis, clone HE

ESTs 

ESTs 

ESTs 

specific granule protein (28 kDa); cyste 
ESTs 

KIAA0228 gene product

ESTs, Weakly similar to KIAA0989 protein 

ESTs 

ESTs, Weakly similar to ALU1_HUMAN ALU S
Homo sapiens mRNA; cDNA DKFZp547P134 (fr 
ESTs 

microtubule-associated protein 1B
ATPase, Ca++ transporting, type 2C, memb 









408460 


AA054726 


Hs.285574 


409091 


AW970386 


Hs.269423 


421987 


AI133161 


Hs.286131 


426002 


AA418703 




441217 


AI922183


Hs.213248 


426006 


R49031 


Hs.22627 


422806 


BE314767 


Hs.1581 


432281 


AK001239 


Hs.274263 


451882 


F13036 


Hs.27373


421129 


BE439899 


Hs.89271 


444042 


NM_004915 


Hs.10237 


410150 


AW382942 


Hs.6774 


423952 


AW877787 


Hs.136102 


452822 


X85689 


Hs.288617 


447752 


M73700 


Hs.347 


441766 


R53790 


Hs.23294


431359 


AW993522 


Hs.292934 


427212 


AW293849 


Hs.28279


449916 


T60525 


Hs.299221 


454014 


AW016670


Hs.233275


419714 


AA758751 


Hs.98216 


428845 


AL157579 


Hs.153610 


417333 


AL157545 


Hs.42179


419986


AI345455


Hs.78915 


407182 


AA312551 


Hs.230157 


420111 


AA255652 




428058 


AI821625 


Hs.191602 


459551 


AI472608 




432524 


AI458020 


Hs.293287 


436207 


AA334774 


Hs.12845 


410870 


U81599 


Hs.66731 


451418 


BE387790 


Hs.98369



409757 


NM_001898


Hs.123114


441124 


T97717 


Hs.119563



428593 


AW207440 



Hs.185973



436401 


AI087958


Hs.29088 


437113 


AA744693 




*Www*T/ 


AI745400 


H&5Q4662 






Hs.53698 


AAZACT7 


AI239832 


Hs.15617


448944 


AB014605 


Hs.22599 


412198 


AA937111 


Hs.69165


HttOW 


nor ooo 


Hs.151380 



A3RQRR 


txrVQQQQQ 


Hs 969307 


4MQCL4 


AW118338



Hs.75251 







Hs.18800 




MMUfc 1 #Uw 


Hs.170434




AW977286 



Hs.169531



AOQAAi 


AJ224172 


Hs.204096


AOAJ1QQ 




Hs.151791


H&rOw? 


Awn?n7ftp 


Hs.70881





AI499Q()1 


Hs.146162




AFwwl 


na*ut35w 




7ACQQQ 


He 


459055 


N23235 




431318 


AA502700 


Hs.293147


452953 


AI932884 


Hs.271741


428372 


AK000684 


Hs.183887 


434401 


AI864131 


Hs.71119 


416434 


AW163045 


Hs.79334 


410268 


AA316181 


Hs.61635 


417517 


AF001176 


Hs.82238 


453616 


NM_003462


Hs.33846 


427958 


AA418000 


Hs.93280 


407945 


X69208 


Hs.606 


425154 


NM_001851 


Hs.154850 


412863 


AA121673 


Hs.29757


420807


AA280627


Hs.57846


430568 


AA769221 


Hs.270847



ESTs 
ESTs 

CGI-101 protein 

gb:zv98c03.s1 Soares_NhHMPu_S1 Homo sapi

ESTs 

ESTs 

glutathione S-transferase theta 2

hypothetical protein FLJ10377

Homo sapiens mRNA; cDNA DKFZp564O1763 (f

ESTs 

ATP-binding cassette, sub-family G (WHIT
ESTs 

KIAA0353 protein
hypothetical protein FLJ22621
lactotransferrin
hypothetical protein FLJ14393
ESTs

ESTs, Weakly similar to ALU7_HUMAN ALU S
pyruvate dehydrogenase kinase, isoenzyme
ESTs 
ESTs 

KIAA0751 gene product 
bromodomain and PHD finger containing, 3 
GA-binding protein transcription factor, 
ESTs 

gb:zs21h11.f1 NCI_CGAP_GCB1 Homo sapiens
ESTs 

gb^70e07j(1 SoaresJYSFJ*JW__OT..PAJl.S 
ESTs 

hypothetical protein MGC13159 
homeobox B13
hypothetical protein FLJ20287
cystatin SN
ESTs 

degenerative spermatocyte (homolog Droso 
ESTs 

gb:ny26c10.s1 NCI_CGAP_GCB1 Homo sapiens

ESTs 

ESTs 

ESTs, Weakly similar to ALU4_HUMAN ALU S 
atrophin-1 interacting protein 1; activ
ESTs 

ESTs, Weakly similar to T16584 hypoM 
ESTs 

DEAD/H (Asp-Glu-Ala-Asp/His) box binding
hypothetical protein FLJ20281
Homo sapiens cDNA FLJ14242 fis, clone OV
RBP1-like protein

lipophilin B (uteroglobin family member) 

KIAA0092 gene product

Homo sapiens cDNA: FLJ23006 fis, clone L

ESTs 

Homo sapiens cDNA FLJ10632 fis, clone NT
Homo sapiens mRNA; cDNA DKFZp761I1912 (f
ESTs, Weakly similar to B34087 hypotheti
ESTs, Moderately similar to A46010 X-lin
ESTs, Weakly similar to A46010 X-linked
hypothetical protein FLJ22104
Putative prostate cancer tumor suppresso 
nuclear factor, interleukin 3 regulated 
six transmembrane epithelial antigen of 
POP4 (processing of precursor, S. cerev
dynein, axonemal, light intermediate pol
potassium intermediate/small conductance
ATPase, Cu++ transporting, alpha polypep 
collagen, type IX, alpha 1 
zinc finger protein 281 
ESTs 

delta-tubulin
433687


AA743991 




gb:ny57g01.s1 NCI_CGAP_Pr18 Homo sapiens


438375


AW015940


Hs.232234 


ESTs 


418092 


R45154 


Hs.106604 


ESTs 


418576 


AW968159 


Hs.289104


Alu-binding protein with zinc finger dom


413328 


Y15723 


Hs.75295 


guanylate cyclase 1, soluble, alpha 3


414271 


AK000275 


Hs.75871 


protein kinase C binding protein 1 


432729 


AK000292 


Hs.278732 


hypothetical protein FLJ20285


433433 


AI692623 


Hs.121513 


Homo sapiens clone 73-1 placenta expres 


439662 


H97552 


Hs.269060 


ESTs 


439743 


AL389956 


Hs.283858 


Homo sapiens mRNA full length insert cDN 


417511 


AL049176 


Hs.82223 


chordin-like


437814 


AI088192 


Hs.135474 


ESTs, Weakly similar to DDX9_HUMAN ATP-D


426342 


AF093419 


Hs.169378 


multiple PDZ domain protein


429782 


NM_005754


Hs.220689 


Ras-GTPase-activating protein SH3-domain 


429975 


AI167145 


Hs.165538 


ESTs 


436209 


AW850417 


Hs.254020 


ESTs, Moderately similar to unnamed prot


438571 


AW020775 


Hs.55022


ESTs 


450223 


AM18204 


Hs.241493 


natural killer-tumor recognition sequenc 


408267 


AW380525 


Hs.267705 


tubulin-specific chaperone e


417730 


Z44761 




gb:HSC28F061 normalized infant brain cDN


425465 


U8964 


Hs.1904


protein kinase C, iota


430599 


NM_004855 


Hs.247118 


phosphatidylinositol glycan, class B


450961 


AW978813 


Hs£50867 


metallothionein 1E (functional)


451386 


AB029006


Hs.26334 


spastic paraplegia 4 (autosomal dominant 


420380 


AA640891 


Hs.102406 


ESTs 


424947 


R77952 


Hs.239625 


ESTs, Weakly similar to alternatively sp 


442653 


BE269247 


Hs.170226 


gb:601185486F1 NIH_MGC_8 Homo sapiens cD


457211 


AW972565 


Hs.32399 


ESTs, Weakly similar to S51797 vasodilat 


425851 


NM_001490 


Hs.159642 


glucosaminyl (N-acetyl) transferase 1, c


446279 


AA490770 


Hs.182382 


ESTs 


433377 


AI752713


Hs.43845 


ESTs 


450218 


R02018 


Hs.168640 


ankylosis, progressive (mouse) homolog 


412715 


NM_000947


Hs.74519 


primase, polypeptide 2A (58kD) 


448164 


R61680 


Hs.26904 


ESTs, Moderately similar to Z195_HUMAN Z


420121 


AW968271 


Hs.191534


ESTs, Weakly similar to ALU1_HUMAN ALU S


421689 


N87820 


Hs.106826 


KIAA1696 protein


445808 


AV655234 


Hs^98083 


ESTs, Moderately similar to PC4259 ferri 


416533 


BE244053 


Hs.79362 


retinoblastoma-like 2 (p130) 


418049 


AA211467 


Hs.190488


Homo sapiens, Similar to nuclear localiz 


436039 


AW023323


Hs.121070 


ESTs 


432653 


N62096 


Hs.293185 


ESTs, Weakly similar to JC7328 amino ad 


420324 


AF163474 


Hs.96744 


prostate androgen-regulated transcript 1 


403047 








436899 


AA764852 


Hs.291567


ESTs 


431117 


AF003522 


Hs.250500


delta (Drosophila)-like 1


427617 


D42063 


Hs.179825 


RAN binding protein 2 


428604 


AK000713 


Hs.193736 


hypothetical protein FLJ20706


433050 


AI093930 


Hs.163440 


Homo sapiens cDNA: FLJ21000 fis, clone C


418575 


AA225313 


Hs.222886


ESTs, Weakly similar to TRHY_HUMAN TRICH


432615 


AA557191 


Hs.55028


ESTs, Weakly similar to I54374 gene NF2 


412652 


AI801777 


Hs.6774 


ESTs 


432473 


AI202703


Hs.152414 


ESTs 


449071 


NM_005872


Hs.22960 


breast carcinoma amplified sequence 2 


450654 


AJ245587 


Hs.25275 


Kruppel-type zinc finger protein 


418866 


T65754 


Hs.100489 


gb:yd 1c07.s1 Stratagene lung (937210) H 


407596 


R86913 




gb:yq30f05.r1 Soares fetal liver spleen 


456516 


BE172704 


Hs£22746 


KIAA1610 protein 


426501 


AW043782 


Hs.293616 


ESTs 


448730 


AB032983 


Hs.21894 


KIAA1157 protein 


458339 


AW976853 


Hs.172843


ESTs 


422083 


NM_001141


Hs.111256


arachidonate 15-lipoxygenase, second typ


420159 


AI572490 


Hs.99785 


Homo sapiens cDNA: FLJ21245 fis, clone C


424103 


NM_001918


Hs.139410


dihydrolipoamide branched chain transacy


449535 


W15267 


Hs.23672 


low density lipoprotein receptor-related 


422048 


NM_012445


Hs.288126 


spondin 2, extracellular matrix protein 


416737 


AF154335 


Hs.79691 


LIM domain protein


419972 


AL041465 


Hs.294038 


golgin-67


420235 


AA256756 


Hs.31178


ESTs 


423412 


AF109300 


Hs.147924


prostate cancer associated protein 5 
429598


AA811257 


Hs.269710 


ESTs 


4.80 


457114 


AI821625 


Hs.191602 


ESTs 


4.80 


421828 


AW891965 


Hs.289109 


histone deacetylase 3


4.79 


424602 


AK002055 


Hs.301129 


hypothetical protein FLJ11193


4.78 


428364 


AA426565 


Hs.160541 


ESTs, Moderately similar to ALU1_HUMAN A 


4.78 


452335 


AW188944 


Hs.61272 


ESTs 


4.78 


410765 


AI694972 


Hs.66180 


nucleosome assembly protein 1-like 2


4.77 


421040 


AA715026 


Hs.135280 


ESTs 


4.76 


421518 


AI056392 


Hs.208819 


ESTs 


4.76 


452560 


BE077084 




ESTs 


4.76 


409752 


AW963990 




gb:EST376063 MAGE resequences, MAQH Homo 


4.75 


439703 


AF086538 


Hs.196245 


ESTs 


4.75 


418836 


AI655499 


Hs.161712 


ESTs 


4.74 


450642 


R39773 


Hs.7130 


copine IV


4.74 


419879 


Z17805 


Hs.93564


Homer, neuronal immediate early gene, 2


4.74 


411440 


AW749402 




gb:QV4-BT0383-281299-061^6 BT0383 Homo 


4.74 


450649 


NM_001429


Hs.297722


E1A binding protein p300


4.74 


408738 


NM_014785 


Hs.47313 


KIAA0258 gene product 


4.73 


435020 


AW505076 


Hs.301855 


DiGeorge syndrome critical region gene 8


4.72


411624 


BE145964 




KIAA0594 protein 


4.72 


439360 


AA448488 


Hs.55346 


ribosomal protein L44


4.72 


440491 


R35252 


Hs.24944 


ESTs, Weakly similar to 2109260A B cell


4.72 


442611 


BE077155 


Hs.177537


hypothetical protein DKFZp761B1514 


4.72 


443555 


N71710 


Hs.21398 


ESTs, Moderately similar to A Chain A, H


4.72 


453800 


BE300741 


Hs.288416


hypothetical protein FLJ13340


4.72 


457528 


AW973791 


Hs.292784 


ESTs 


4.72 


416795 


AI497778 


Hs.168053 


HBV pX associated protein-8 


4.71 


407302 


R74206 


Hs.268755 


ESTs, Weakly similar to 178885 serine/th 


4.71 


404721 








4.70 


426261 


AW242243 


Hs.168670 


peroxisomal farnesylated protein


4.70 


431924 


AK000850


Hs.272203


Homo sapiens cDNA FLJ20843 fis, clone AO


4.70 


435255 


AF193768 


Hs.13872 


cytokine-like protein C17


4.70 


438295 


AI394151


Hs.37932 


ESTs 


4.70 


442655 


AW027457 


Hs.30323 


ESTs, Weakly similar to B34087 hypotheti


4.70 


415788 


AW628686 


Hs.78851 


KIAA0217 protein 


4.69 


442760 


BE075297 


Hs.10067 


ESTs, Weakly similar to A43932 mucin 2 p 


4.69 


432432 


AA541323 


Hs.115831


ESTs 


4.68 


454398 


AM63437 


Hs.11556


Homo sapiens cDNA FLJ12566 fis, clone NT


4.68 


452741 


BE392914 


Hs.30503 


Homo sapiens cDNA FLJ11344 fis, clone PL


4.67 


424853 


BE549737 


Hs.132967 


Human EST clone 122887 mariner transpose 


4.67 


419706 


C04649 


Hs.77899 


tropomyosin 1 (alpha) 


4.66 


412088 


AI689496 


Hs.108932 


ESTs 


4.65 


416276 


U41060 


Hs.79136 


LIV-1 protein, estrogen regulated


4.64 


429281 


AA830856 


Hs.29808 


Homo sapiens cDNA: FLJ21122 fis, clone C


4.64 


448207 


AI475490 


Hs.170577 


ESTs 


4.64 


408374 


AW025430 


Hs.155591 


forkhead box F1


4.64 


447162 


BE328091 


Hs.157396 


ESTs, Weakly similar to A46010 X-linked


4.64 


451900 


AB023199 


Hs.27207 


KIAA0982 protein 


4.63 


421437 


AW821252 


Hs.104336 


hypothetical protein 


4.63 


418624 


AI734080


Hs.104211


ESTs 


4.63 


426172 


AA371307 


Hs.125056 


ESTs 


4.62 


439331 


AW136488 


Hs.25545 


ESTs 


4.61 


452994 


AW962597 


Hs.31305


KIAA1547 protein 


4.61 


457726 


AI217477 


Hs.194591 


ESTs 


4.60 


434629 


AA769081 


Hs.4029


glioma-amplified sequence-41 


4.60 


403764 








4.58 


410659 


AI080175 


Hs.68826 


ESTs 


4.58 


432383 


AK000144 


Hs.274449 


Homo sapiens cDNA FLJ20137 fis, clone CO


4.58 


451246 


AW189232 


Hs.39140 


cutaneous T-cell lymphoma tumor antigen 


4.58


433234 


AB040928 


Hs.65366 


KIAA1495 protein 


4.57


424983 


AI742434 


Hs.169911


ESTs 


4.56 


437812 


AI582291


Hs.16846 


ESTs, Weakly similar to 04HUD1 debrisoqu 


4.56


438447 


AI082883


Hs.167593 


hypothetical protein FLJ13409; KIAA1711


4.55


434715 


BE005346 


Hs.116410


ESTs 


4.55


447673 


AI823987 


Hs.182265 


ESTs 


4.54


408697 


N50204 


Hs.283709 


lipopolysaccharide specific response-7 p


4.54


436645 


AW023424


Hs.156520


ESTs 


4.54 


421247 


BE391727 


Hs.102910 


general transcription factor IIH, polype


4.53 


450377 


AB033091 


Hs.24936 


KIAA1265 protein 


4.53 
433644 AW342028 Hs.256112 gb:hb75d03.x1 NCI_CGAP_Ut2 Homo sapiens 4.53

408321 AW405882 Hs.44205 cortistatin 4.53

439225 AA192669 Hs.45032 ESTs 4.52

440348 AW015802 Hs.47023 ESTs 4.52

446351 AW444551 Hs.258532 x 001 protein 4.52

451212 AW902672 Hs.287334 ESTs 4.52

430294 AI538226 Hs.135184 guanine nucleotide binding protein 4 4.52

435005 U80743 Hs.4316 trinucleotide repeat containing 12 4.52

448072 AI459306 Hs.24908 ESTs 4.50

403721 4.50

451018 AW965599 Hs.247324 mitochondrial ribosomal protein S14 4.50

453070 AK001465 Hs.31575 SEC63, endoplasmic reticulum translocon 4.49

417412 X16898 Hs.82112 interleukin 1 receptor, type I 4.48

439735 AI635386 Hs.142846 hypothetical protein 4.48

435663 AI023707 Hs.134273 ESTs 4.48

424036 AA770688 Hs.81946 H2A histone family, member L 4.48 

426386 AA748850 Hs.174877 bladder cancer overexpressed protein 4.48 

408622 AA056060 Hs.202577 Homo sapiens cDNA FLJ12166 fis, clone MA 4.47

444269 AI590346 Hs.146220 ESTs 4.47

430187 AI799909 Hs.158989 ESTs 4.46

427761 AA412205 Hs.140996 ESTs 4.46

430261 AA305127 Hs.237225 hypothetical protein HT023 4.46

444169 AV648170 Hs.38756 ESTs 4.44

430598 AK001764 Hs.247112 hypothetical protein FLJ10902 4.44

412903 BE007967 Hs.155795 ESTs 4.44

417048 AI088775 Hs.35488 geranylgeranyl diphosphate synthase 1 4.44

442710 AI015631 Hs.23210 ESTs 4.44

457413 AA743462 Hs.165337 ESTs 4.44

400303 AA242756 Hs.79136 LIV-1 protein, estrogen regulated 4.42

443268 AI800271 Hs.129445 hypothetical protein FLJ12496 4.42

438209 AL120659 Hs.6111 aryl-hydrocarbon receptor nuclear transl 4.42

431724 AA514535 Hs.283704 ESTs 4.41

412280 AW205116 Hs.272814 hypothetical protein DKFZp434E1723 4.40

440801 AA906366 Hs.190535 ESTs 4.40

452959 AI933416 Hs.189674 ESTs 4.40

453861 AI026838 Hs.30120 ESTs, Weakly similar to NUCL_HUMAN NUCLE 4.40

417421 AL138201 Hs.32120 nuclear receptor subfamily 4, group A, m 4.40

447270 AC002551 Hs.331 general transcription factor IIIC, polyp 4.38

433641 AF080229 gb:Human endogenous retrovirus K clone 1 4.38

447078 AW885727 Hs.301570 ESTs 4.38

424242 AA337476 hypothetical protein MGC13102 4.37

408170 AW204516 Hs.31835 ESTs 4.36

448757 AI366784 Hs.48820 TATA box binding protein (TBP)-associate 4.36

420021 AA252848 Hs.293557 ESTs 4.36

449694 AI659790 Hs.253302 ESTs 4.36

453867 AI929383 Hs.108196 hypothetical protein DKFZp434N185 4.36

458712 AI347502 Hs.173066 hypothetical protein FLJ20761 4.36

417251 AW015242 Hs.39488 ESTs, Weakly similar to YK54_YEAST HYPOT 4.35

434423 NM_006769 Hs.3844 LIM domain only 4 4.35

423427 AL137612 Hs.285848 KIAA1454 protein 4.34

415715 F30364 ESTs 4.33

404561 4.32

422969 AA782536 Hs.122647 N-myristoyltransferase 2 4.32

423685 BE350494 Hs.49753 uveal autoantigen with coiled coil domai 4.32

443977 AL120988 Hs.150627 ESTs, Weakly similar to I38022 hypotheti 4.32

425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II 4.32

431583 AL042613 Hs.262476 S-adenosylmethionine decarboxylase 1 4.31

411379 AI816344 Hs.12554 ESTs, Weakly similar to NPL4_HUMAN NUCLE 4.30

421476 AW953805 Hs.21887 ESTs 4.30

425178 H16097 Hs.161027 ESTs 4.30

439262 AA832333 Hs.124399 ESTs 4.30

442818 AK001741 Hs.8739 hypothetical protein FLJ10879 4.30

421977 W94197 Hs.110165 ribosomal protein L26 homolog 4.29

437114 AA836641 Hs.163085 ESTs 4.28

420195 N44348 Hs.300794 Homo sapiens cDNA FLJ11177 fis, clone PL 4.28

418330 BE409405 Hs.94722 ESTs 4.27

419750 AL079741 Hs.183114 Homo sapiens cDNA FLJ14236 fis, clone NT 4.26

437065 AL036450 Hs.103238 ESTs 4.26

455276 BE176479 gb:RC3-HT0585-1603(XM)22-b09 HT0585 Homo 4.24
416292


AA179233 


Hs.42390 


nasopharyngeal carcinoma susceptibility 


423740 


Y07701 


Hs.132243 


aminopeptidase puromycin sensitive


442023 


AI187878 


Hs.144540 


ESTs 


426764 


AA732524 


Hs.151464


ESTs, Weakly similar to ALUC_HUMAN !!!


454058 


AI273419 


Hs.135146 


hypothetical protein FLJ13984


456511 


AA282330 


Hs.145668 


ESTs 


448330 


AL036449 


Hs.207163 


ESTs 


424701 


NM_005923


Hs.151988 


mitogen-activated protein kinase kinase 


432621 


AI29&501 


Hs.12807 


ESTs, Weakly similar to T46428 hypotheti


445707 


AI248720 


Hs.114390


ESTs 


419910 


AA662913 


Hs.190173 


ESTs, Weakly similar to A46010 X-linked


424085 


NM_002914


Hs.139226


replication factor C (activator 1) 2 (40 


440749 


W22335 


Hs.7392 


hypothetical protein MGC3199


442787 


W93048 


Hs.227203 


hypothetical protein MGC2747 


443414 


R54594 


Hs.25209 


ESTs 


443556 


AA256769 


Hs.94949 


methylmalonyl-CoA epimerase


444170 


AW613879 


Hs.102408 


ESTs 


446751 


AA766998 


Hs.85874 


Human DNA sequence from clone RP11-16L21


421041 


N36914 


Hs.14691 


ESTs, Moderately similar to I38022 hypot 


447476 


BE293466 


Hs.20880 


ESTs, Weakly similar to I38022 hypotheti 


448543 


AW897741 


Hs.21380 


Homo sapiens mRNA; cDNA DKFZp586P1124 (f


410294 


AB014515 


Hs.288891 


KIAA0615 gene product


433607 


AA602004 


Hs.23260 


ESTs 


435552 


AI668636 


Hs.193480 


ESTs, Moderately similar to ALU6_HUMAN A


447124 


AW976438 


Hs.17428 


RBP1-like protein


453308 


AW959731 


Hs.32538 


ESTs 


439328 


W07411 


Hs.118212


ESTs, Moderately similar to ALU3_HUMAN A


430473 


AW130690 


Hs.299842 


ESTs 


437257 


AI283085


Hs.290931


ESTs, Weakly similar to YFJ7_YEAST HYPOT


438018 


AK001160 


Hs.5999 


hypothetical protein FLJ10298


443857 


AI089292 


Hs.287621 


hypothetical protein FLJ14069


446711 


AF169692 


Hs.12450 


protocadherin 9


419103 


Z40229 


Hs.96423 


hypothetical protein FLJ23033


405403 








407378 


AA299264 




ESTs, Moderately similar to I38022 hypot


408986 


AW298602


Hs.197687 


ESTs 


418727 


AA227609 


Hs.94834 


ESTs 


434400 


AI478211 


Hs.186896 


Homo sapiens cDNA FLJ11417 fis, clone HE


438576 


AA811244 


Hs.164168 


ESTs 


450459 


AI697193 


Hs.299254


Homo sapiens cDNA: FLJ23597 fis, clone L


429887 


AW366286 


Hs.145698 


splicing factor (CC1.3)


448148 


NM_016578


Hs£0509 


HBV pX associated protein-8


450316 


W84446 


Hs.17850 


hypothetical protein MGC4643


417531 


NM_003157


Hs.1087 


serine/threonine kinase 2 


431592 


R69016 


Hs£93871 


hypothetical protein MQC1 0895s 


432463 


AA548518 


Hs.186733 


ESTs 


433613 


AA836126 


Hs£669 


ESTs 


434739 


AA804487 


Hs.144130 


ESTs 


438259 


AW205969 


Hs.131808 


ESTs 


425810 


AI923627 


Hs.31903 


ESTs 


432672 


AW973775 


Hs.130760


myosin phosphatase, target subunit 2


433345 


AI681545 


Hs.152982 


hypothetical protein FLJ13117


432712 


AB016247 


Hs.288031


sterol-C5-desaturase (fungal ERG3, delta


453020 


AL162039


Hs.31422


Homo sapiens mRNA; cDNA DKFZp434M229 (fr 


412045 


AA099802 


Hs.4299 


transmembrane, prostate androgen induced 


435114 


AA775483 


8^288936 


mitochondrial ribosomal protein 19 


443204 


AW205878 


Hs.29643


Homo sapiens cDNA FLJ13103 fis, clone NT


445459 


AI478629 


Hs.158465 


likely ortholog of mouse putative IKK re 


438938 


H46212 


Hs.137221 


ESTs 


454119 


BE549773 


Hs.40510 


uncoupling protein 4 


411000 


N40449 


Hs.201619


ESTs, Weakly similar to S38383 SEB4B pro 


418926 


AA232658 


Hs*7070 


UDP-glucose:glycoprotein glucosyltransfe


424432 


AB037821 


Hs.146858 


protocadherin 10 


449873 


AA002064 


Hs.18920 


ESTs 


429299 


AI620463


Hs.99197 


hypothetical protein MGC13102 


422174 


AL049325 


Hs.112493


Homo sapiens mRNA; cDNA DKFZp564D036 (fr


455497 


AA112573


Hs.285691 


Homo sapiens prostein mRNA, complete cds 


415138 


C18356 


Hs.78045 


tissue factor pathway Inhibitor 2 


402791 
426792


AL044854 


Hs.172329 


KIAA0576 protein 


4.04 




438660 


U95740 


Hs.6349 


Homo sapiens, clone IMAGE:3010666, mRNA,


4.04 




442768 


AL048534 


Hs.48458 


ESTs, Weakly similar to ALU8_HUMAN ALU S 


4.04 




447568 


AF155655 


Hs.18885


CGM 16 protein 


4.04 




428342 


AE739168 


Hs.131798 


Homo sapiens cDNA FLJ13458 fis, clone PL


4.04 




453439 


AI572438 


Hs.32976 


guanine nucleotide binding protein 4 


4.02 




453857 


AL080235 


Hs.35861 


DKFZP586E1621 protein 


4.02 




428249 


AA130914 


Hs.183291


zinc finger protein 268 


4.02 




432015 


AL157504 


Hs.159115 


Homo sapiens mRNA; cDNA DKFZp588O0724 (f 


4.02 


445495 


BE622641 


Hs.38489


ESTs, Weakly similar to I38022 hypotheti


4.02 




451746 


M86178 




ESTs 


4.02 




452211 


AI985513 


Hs.233420


ESTs 


4.02 




453046 


AA284040 


H8519441 


ESTs, Highly similar to CA5B_HUMAN CARBO


4.02 




456038 


AA203285 


Hs.294141


ESTs, Weakly similar to alternatively sp 


4.02 


452449 


AW068658 


Hs.20943


ESTs 


4.02 




407204 


R41933 


Hs.140237 


ESTs, Weakly similar to ALU1_HUMAN ALU S


4.01 




428046 


AW812795 


Hs.155381 


ESTs, Moderately similar to I38022 hypot


4.01 




438520 


AA706319 


Hs£8416 


ESTs 


4.01





443292 


AK000213 


Hs.9196 


hypothetical protein 


4.01 


432715 


AA247152 


Hs£00483 


ESTs, Weakly similar to KIAA1074 protein


4.00 




403797 








4.00 




418347 


AA216419 


Hs.269295 


gb:nc16e03.s1 NCI_CGAP_Pr1 Homo sapiens


4.00 




419459 


AW291128 


Hs.278422 


DKFZP586G1122 protein


4.00 




420911 


U77413 


Hs.100293 


O-linked N-acetylglucosamine (GlcNAc) tr 


4.00 


425176 


AW015644 


Hs.301430 


TEA domain family member 1 (SV40 transcr 


4.00 




447505 


AUM9266 


Hs.18724 


Homo sapiens mRNA; cDNA DKFZp564F093 (fr


4.00 




453773 


AL133761 




gb:DKFZp761C1413_r1 761 (synonym: hamy2)


4.00 




434384 


AA631910 


Hs.162849 


ESTs 


3.99 




422471 


AA311027 


Hs.271894


ESTs, Weakly similar to I38022 hypotheti


3.99 


427386 


AW836261 


Hs.177486 


ESTs 


3.98 




433394 


AI907753 


Hs.93310 


cerebral cavernous malformations 1 


3.98 




441269 


AW015206 


Hs.178784 


ESTs 


3.97 




419629 


AB020695 


Hs.91662


KIAA0888 protein 


3.96 




435008 


AF150262 


Hs.162898 


ESTs 


3.96 


456649 


R74441 


Hs.117176


poly(A)-binding protein, nuclear 1


3.96 




418723 


AA504428 


Hs.10487 


Homo sapiens, clone IMAGE:3954132, mRNA, 


3.96 




428738 


NM_000380 


Hs.192803 


xeroderma pigmentosum, complementation g 


3.95 




430456 


AA314998 


Hs.241503


hypothetical protein


3.95





422017 


NM_003877


Hs.110776


STAT induced STAT inhibitor-2


3.95 


409960 


BE261944 


Hs.153028


hexokinase 1


3.95 




455309 


AW894017 




gb:RC4-NN0027-15040(K)12-fl04 NN0027 Homo 


3.95 




450295 


AI766732 


Hs.201194 


ESTs 


3.94 




456660 


AA909249 


Hs.112282 


solute carrier family 30 (zinc transport 


3.94 




410908 


AA121686 


Hs.10592 


ESTs 


3.94 


447145 


AA761073 


Hs.192943 


TRAF family member-associated NFKB activ 


3.94 




449318 


AW236021 


Hs.108788


Homo sapiens, Similar to RIKEN cDNA 5730


3.94 




449869 


W57990 


Hs.60059 


Homo sapiens cDNA FLJ11478 fis, clone HE


3.94 




411887 


AW182924 


Hs.128790 


ESTs 


3.93 




437531 


AI400752 


Hs.112259


T cell receptor gamma locus 


3.93 


452238 


F01811 


Hs.187931 


ESTs 


3.93 




410486 


AW235094 


Hs.193424 


zinc finger protein 


3.92 




424882 


AI379461 


Hs.153636


far upstream element (FUSE) binding prot 


3.92 




426269 


H15302 


Hs.168950


Homo sapiens mRNA; cDNA DKFZp566A1046 [t 


3.92 




427043 


AA397679 


Hs.298480 


ESTs 


3.92 


440404 


AI015881 


Hs.125616 


mitochondrial ribosomal protein 85 


3.92 




452762 


AW501435 


Hs.171409


v-akt murine thymoma viral oncogene homo 


3.92 




453058 


AW612293 


Hs.288684 


Homo sapiens cDNA FLJ11750 fis, clone HE


3.92 




423583 


AL122055 


Hs.129636 


KIAA1028 protein 


3.92 




408001 


AA046458 


Hs.95296 


ESTs 


3.92 


419197 


N48921 


Hs.27441 


KIAA1615 protein 


3.91 




426695 


AI355647 


Hs.189999 


purinergic receptor (family A group 5)


3.91 




401747 








3.91 




410011 


AB020641 


Hs.57856


PFTAIRE protein kinase 1 


3.91 




432205 


AI806583 


Hs.125291 


ESTs 


3.91 


447857 


AA081218 


Hs.58608 


Homo sapiens cDNA FLJ14208 fis, clone NT


3.91 




446494 


AA463276 


Hs.288906 


WW Domain-Containing Gene


3.91 




409928 


AL137163 


Hs.57549 


hypothetical protein dJ473B4 


3.90 




411598 


BE336654


Hs.70937 


H3 histone family, member A 


3.90 




424790 


AI119344


Hs.13326


ESTs, Weakly similar to 2004399A chromos


3.90 
425707


AF115402


Hs.11713


E74-like factor 5 (ets domain transcript


3.90 




431325 


AW026751 


Hs.5794 


ESTs, Weakly similar to 2109260A B cell


3.89 




451806 


NM_003729


Hs.27076


RNA 3'-terminal phosphate cyclase


3.89 




401045 








3.89 




433023 


AW864793 


Hs.34161


thrombospondin 1 


3.89 




452160 


BE378541 


Hs.278815 


cysteine sulfinic acid decarboxylase-rel 


3.89 




437372 


AA323968 


Hs.283631 


hypothetical protein DKFZp5476183 


3.89 




417067 


AJ001417 


Hs.81086 


solute carrier family 22 (extraneuronal


3.88





410467 


AF102546


Hs.63931 


dachshund (Drosophila) homolog 


3.88 


422660 


AW297582 


Hs.237062 


hypothetical protein FU22548 similar to 


3.88 




431930 


AB035301 


Hs.272211


cadherin 7, type 2 


3.88 




453047 


AW023798 


Hs.286025 


ESTs 


3.88 




433891 


AA613792 




gb:no97h03.s1 NCI_CGAP_Pr2 Homo sapiens


3.88





401785 








3.88 


431088 


AA491824 


Hs.196881 


ESTs 


3.88 




451952 


AL120173 


Hs.301663 


ESTs 


3.87 




422089 


AA523172 


Hs.103135 


ESTs, Weakly similar to SFR4_HUMAN SPLIC


3.87 




452277 


AL049013 


Hs.28783 


KIAA1223 protein 


3.87 




438279 


AA805166 


Hs.165165 


HIV-1 rev binding protein 2 


3.86 


458229 


AI929602 


Hs.177 


phosphatidylinositol glycan, class H 


3.86 




406414 








3.86 




417193 


AI922189 


Hs.288390 


hypothetical protein FU22795 


3.85 




413174 


AA723564 


Hs.191343 


ESTs 


3.85 




433332 


AI367347 


Hs.127809 


Homo sapiens clone TCCCTA00151 mRNA sequ 


3.85 


411089 


AA456454 


Hs.118637


cell division cycle 2-like 1 (PITSLRE pr


3.85 




412494 


AL133900 


Hs.792 


ADP-ribosylation factor domain protein 1


3.84 




413530 


AA130158 


Hs.19977 


ESTs, Moderately similar to ALU8_HUMAN A 


3.84 




459592 


AL037421 


Hs.208746


ESTs, Moderately similar to pot ORF I [ 


3.84 




418329 


AW247430 


Hs.84152 


cystathionine-beta-synthase 


3.83 


451468 


AW503398 


Hs.210047 


ESTs, Moderately similar to I38022 hypot 


3.83 




434804 


AA649530 




gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens


3.83 




401819 








3.82 




424179 


F30712 




Homo sapiens, clone IMAGE:4285740, mRNA 


3.82 




424850 


AA151057 


Hs.153498 


chromosome 18 open reading frame 1 


3.82 


426472 


BE246138 


Hs.30853 


ESTs 


3.82 




426625 


T78300 


Hs.171409 


serologically defined colon cancer antig 


3.82 




427585 


D31152 


Hs.179729 


collagen, type X, alpha 1 (Schmid metaph


3.82 




427756 


AI376540 


Hs.15574 


ESTs 


3.82 




444701 


AI916512 


Hs.198334 


ESTs 


3.82 


423052 


M28214 


Hs.123072 


RAB3B, member RAS oncogene family 


3.82 




429259 


AA420450 


Hs.292911 


ESTs, Highly similar to S60712 band-6-pr


3.82 




416111 


AA033813 


Hs.79018 


chromatin assembly factor 1, subunit A (


3.82 




433586 


T85301 




gb:yd78d06.s1 Soares fetal liver spleen 


3.81 




438527 


AI969251 


Hs.143237 


RAB7, member RAS oncogene family-like 1


3.81 


410297 


AA148710 


Hs.159441 


lumican


3.81




429898 


AW117322


Hs.42366 


ESTs 


3.81 




409079 


W87707 


Hs.82065 


interleukin 6 signal transducer (gp130,


3.80 




419423 


D26488


Hs.80315 


KIAA0007 protein 


3.80 




429643 


AA455889 


Hs.187548 


FYVE-finger-containing Rab5 effector pro


3.80


431499 


NM_001514


Hs.258561


general transcription factor IIB


3.80




445060 


AA830811 


Hs.88808 


ESTs 


3.80 




449419 


R34910 


Hs.119172


ESTs 


3.80 




450584 


AA040403


Hs.60371 


ESTs 


3.80 





426137


AL040683


Hs.167031


DKFZP566D133 protein


3.79 


420185 


AL044056 


Hs.158047 


ESTs 


3.79 




410076 


T05387 


Hs.7991


ESTs 


3.78 




444078 


BE246919 


Hs.10290 


U5 snRNP-specific 40 kDa protein (hPrp8-


3.78 




417318 


AW953937 


Hs.12891 


ESTs 


3.78 




414664 


AA587775 


Hs.66295 


multi-PDZ-domain-containing protein


3.78 


410275 


U85658 


Hs.61786 


transcription factor AP-2 gamma (activat


3.77 




410503 


AW975746 


Hs.188662 


KIAA1702 protein 


3.77 




434170 


AA626509 


Hs.122329 


ESTs 


3.77 




421838 


AW881089 


Hs.108806


Homo sapiens mRNA; cDNA DKFZp566M0947 (f 


3.77 





425268


AI807883


Hs.156932


Homo sapiens cDNA FU20653 fis, clone KA 


3.76 


431696 


AA259068 


Hs.267819 


protein phosphatase 1, regulatory (inhib 


3.76 




411990 


AW963624 


Hs.31707 


ESTs, Weakly similar to YEW4_YEAST HYPOT


3.76 




430291 


AV660345 


Hs.238126 


CGI-49 protein


3.76 




448779 


BE042877 


Hs.177135 


ESTs 


3.76 




452682 


AA456193 


Hs.155606 


progesterone membrane binding protein 


3.75 



452598 


Af 331594 


Hs.68647 


ESTs, Weakly similar to ALU7_HUMAN ALU S


3.75


439498 


AAS08731 


Hs.58297 


CLLL8 protein 


3.75 


440258 


AI741633


Hs.125350


ESTs 


3.74 


456848 


AL121087 


Hs.296406 


KIAA0685 gene product


3.74 


415082 


AA160000


Hs.137396 


ESTs, Weakly similar to JC5238 galactosy 


3.74 


420853 


AI224532 


Hs.88550 


ESTs 


3.74 


431637 


AI879330 


Hs.265950


hypothetical protein FLJ10563


3.74 


440411 


N30256 


Hs.156971 


hypothetical protein DKFZp434G1415 


3.74 


405917 








3.74 


419440 


AB020689 


Hs.90419 


KIAA0882 protein 


3.74


451230 


BE546208 


Hs.26090 


hypothetical protein FLJ20272


3.73 


429597 


NM_003816


Hs.2442


a disintegrin and metalloproteinase doma


3.73 


430144 


AI732722 


Hs.187694 


ERGL protein; ERGIC-53-like protein


3.72


438394 


BE379623 


Hs.27693


peptidylprolyl isomerase (cyclophilin)-l


3.72


440527 


AV657117 


Hs.184164 


ESTs, Moderately similar to S65657 alpha 


3.72


449433 


AI672096


Hs.9012


ESTs, Weakly similar to S26650 DNA-bindi


3.72


456228 


BE503227 


Hs.134759


ESTs


3.72


448663 


BE614599 


Hs.106823 


hypothetical protein MGC14797 


3.72 


415075 


L27479 


Hs.77889 


Friedreich ataxia region gene X123 


3.72 


433544 


AI793211 


Hs.165372 


ESTs, Moderately similar to ALU1_HUMAN A


3.71


418293 


AI224483 


Hs.16063 


hypothetical protein FLJ21877


3.71


449897 


AW819642 


Hs.24135 


transmembrane protein vezatin; hypotheti


3.71


420297 


AI628272 


Hs.88323 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


3.70


423065 


R96158 


Hs.194606 


Homo sapiens, clone MGC5406, mRNA, comp 


3.70


429340 


N35938 


Hs.199429 


Homo sapiens mRNA; cDNA DKFZp434M2216 (f


3.70


437777 


AA768098 


Hs.189079 


ESTs 


3.70


440351 


AF030933 


Hs.7179 


RAD1 (S. pombe) homolog 


3.70


443603 


BE502601


Hs.134289 


ESTs, Weakly similar to KIAA1063 protein 


3.70


446965 


BE242873 


Hs.16677 


WD repeat domain 15


3.70 


412350 


AI659306 


Hs.73826


protein tyrosine phosphatase, non-recept


3.70


433852 


AI378329 


Hs.126629 


ESTs 


3.70 


433142 


AL120697 


Hs.110640


ESTs 


3.69 


419994 


AA282881 


Hs.190057 


ESTs 


3.69 


412628 


AI972402 


Hs.173902 


hypothetical protein MGC2648 


3.69


431416 


AA532718 


Hs.178604 


ESTs 


3.69 


439444 


AI277652 


Hs.54578


ESTs, Weakly similar to I38022 hypotheti


3.68 


414709 


AA704703 


Hs.77031 


Sp2 transcription factor 


3.68 


447397 


BE247676 


Hs.18442 


E-1 enzyme 


3.68 


405718 








3.68 


425217 


AU076696 


Hs.155174 


CDC5 (cell division cycle 5, S. pombe, h 


3.68 


442242 


AV647908 


Hs.90424 


Homo sapiens cDNA: FLJ23285 fis, clone H


3.68 


424690 


BE538356 


Hs.151777 


eukaryotic translation Initiation factor 


3.68 


421734 


AI318624


Hs.107444


Homo sapiens cDNA FLJ20562 fis, clone KA


3.67 


427221 


L15409 


Hs.174007 


von Hippel-Lindau syndrome


3.67 


439864 


AI720078 


Hs.291997 


ESTs, Weakly similar to A47582 B-cell gr


3.66 


402408 








3.66 


426327 


W03242 


Hs.44898 


Homo sapiens clone TCCCTA00151 mRNA sequ 


3.66 


427119 


AW880562 


Hs.114574 


ESTs 


3.66 


427356 


AW023462 


Hs.97849 


ESTs 


3.66 


452946 


X95425 


Hs.31092 


EphA5


3.66 


419078 


M93119 


Hs.69534


insulinoma-associated 1


3.66 


416295 


AI064824 


Hs.193385 


ESTs 


3.65 


427144 


X95097 


Hs.2126 


vasoactive intestinal peptide receptor 2 


3.65 


447500 


AI381900 


Hs.159212 


ESTs 


3.65 


453127 


AI696671 


Hs.294110 


ESTs 


3.65 


423398 


AI382555 


Hs.127950 


bromodomain-containing 1


3.65 


419346 


AI830417 




polybromo 1


3.64 


441540 


C01387 


Hs.127128


ESTs 


3.64 


446501 


AI302616 


Hs.150819 


ESTs 


3.64 


459527 


AW977556 


Hs.291735 


ESTs, Weakly similar to I78885 serine/th


3.63 


446320 


AF126245 


Hs.14791 


acyl-Coenzyme A dehydrogenase family, me 


3.63 


435706 


W31254 


Hs.7045


GL004 protein 


3.63 


400110 








3.62 


410313 


R10305 


Hs.185683 


ESTs 


3.62 


414713 


BE465243 


Hs.12664 


ESTs 


3.62 


436279 


AW900372 


Hs.180793 


ESTs, Weakly similar to S65657 alpha-1C-


3.62 


439818 


AL360137 


Hs.19934 


Homo sapiens mRNA full length insert cDN


3.62 


451797 


AW663658 


Hs.56120 


small inducible cytokine subfamily E, me


3.62 


451294 


AI457338 


Hs.29894 


ESTs 


3.62 



434194 


AF119847


Hs^83940 


Homo sapiens PRO1550 mRNA, partial cds 


3.62 


404939 








3.62 


408101 


AW968504 


Hs.123073 


CDC2-related protein kinase 7


3.62 


435846 


AA700870 


Hs.14304 


ESTs 


3.61 


432833 


N51075 


Hs.47191


ESTs 


3.61 


427276 


AA400269 


Hs.49598 


ESTs 


3.61 


433495 


AW373784 


Hs.71 


alpha-2-glycoprotein 1, zinc


3.60 


403137 








3.60 


404165 








3.60 


409571 


AA504249 


Hs.187585 


ESTs 


3.60 


410561 


6E540255 


Hs.6994 


Homo sapiens cDNA: FLJ22044 fis, clone H


3.60 


412924 


BE018422 


Hs.75258 


H2A histone family, member Y


3.60 


434228 


Z42047 


Hs.283978


Homo sapiens PRO2751 mRNA, complete cds


3.60 


436797 


AA731491 


Hs.178518 


hypothetical protein MGC14879 


3.60 


437162 


AW005505


Hs.5464 


thyroid hormone receptor coactivating pr


3.60 


437444 


H46008 


Hs.31518 


ESTs 


3.60 


404210 








3.59


446157 


BE270828 


Hs.131740 


Homo sapiens cDNA: FLJ22562 fis, clone H


3.59


437587 


AI591222


Hs.122421 


Human DNA sequence from clone RPM87J11 


3.58 


423147 


AA987927 


Hs.131740 


Homo sapiens cDNA: FLJ22562 fis, clone H


3.57 


452226 


AA024898 


Hs.296002 


ESTs 


3.56 


443775 


AF291664 


Hs.204732 


matrix metalloproteinase 26


3.56 


452501 


AB037791 


Hs.29716 


hypothetical protein FLJ10980


3.56 


428647 


AA830050 


Hs.124344 


ESTs 


3.56 


422443 


NM_014707


Hs.116753 


histone deacetylase 7B


3.55


447966 


AA340605 


Hs.105887 


ESTs, Weakly similar to Homolog of rat Z


3.55 


420892 


AW975076 


Hs.172589 


nuclear phosphoprotein similar to S. cer 


3.55


420230 


AL034344 


Hs.298020 


forkhead box C1


3.55 


418428 


Y12490 


Hs.85092 


thyroid hormone receptor interactor 11 


3.54 


428949 


AA442153 


Hs.104744 


hypothetical protein DKFZp434J0617 


3.54


444929 


AI685841 


Hs.161354 


ESTs 


3.54


433339 


AF019226 


Hs.8036 


glioblastoma overexpressed 


3.54 


424369 


R87622 


Hs.26714 


KIAA1831 protein 


3.54 


433002 


AF048730 


Hs.279906 


cyclin T1


3.53


435425 


H16263 


Hs.31416 


ESTs 


3.53


415621 


AI648602


Hs.131189


ESTs 


3.53


416974 


AF010233 


Hs.80667 


RALBP1 associated Eps domain containing 


3.53


405793 








3.52


409770 


AW499536 




gb:UWF-BR0p-aik>12-(HJI.r1 NIHJ/IGC.5 


3.52


425305 


AA363025 


Hs.155572


Human clone 23801 mRNA sequence 


3.52


428939 


AW236550 


Hs.131914 


ESTs 


3.52


438388 


AA806349 


Hs.44698 


ESTs 


3.52


443703 


AV646177 


Hs.213021


ESTs 


3.52 


457940 


AL360159 


Hs.30445 


Homo sapiens TRIpartite motif protein ps


3.52


402444 








3.52


409643 


AW450866 


Hs.257359 


ESTs 


3.51


418250 


U29926 


Hs.83918


adenosine monophosphate deaminase (isofo 


3.51


432745 


AI821926 


Hs.269507 


gb:nt78f05jc5 NCI_CGAP_Pr3 Homo sapiens 


3.51


414222 


AL135173 


Hs.878 


sorbitol dehydrogenase 


3.51 


430061 


AB037817 


Hs.230188 


KIAA1396 protein 


3.51


421491 


H99999 


Hs.42738 


ESTs 


3.50 


422384 


AA224077 


Hs.42438 


Sm protein F 


3.50


434565 


T52172 




ESTs 


3.50 


438379 


N23018 


Hs.171391 


C-terminal binding protein 2 


3.50


439741 


BE379646 


Hs.6904 


Homo sapiens mRNA full length insert cDN


3.50


447311 


R37010 


Hs.33417 


Homo sapiens cDNA: FLJ22806 fis, clone K


3.50


447805 


AW627932 


Hs.19614 


gemin4


3.50


454265 


H03556 


Hs.300949 


ESTs, Weakly similar to thyroid hormone 


3.50


418838 


AW385224 


Hs.35198 


ectonucleotide pyrophosphatase/phosphodi


3.50 


448804 


AW512213 


Hs.42500 


ADP-ribosylation factor-like 5


3.50 


409617 


BE003760 


Hs.55209 


Homo sapiens mRNA; cDNA DKFZp434K0514 (f


3.49 


434075 


AW003416 


Hs.160604 


ESTs 


3.49 


444190 


AI878918


Hs.10526 


cysteine and glycine-rich protein 2 


3.49 


435017 


AA336522 


Hs.12854 


angiotensin II, type I receptor-associat


3.48 


423445 


NM_014324


Hs.128749 


alpha-methylacyl-CoA racemase


3.48 


420271 


AI954365 


Hs.42692 


ESTs 


3.48 


443684


AI681307 


Hs.166674 


ESTs 


3.48 


444168 


AW379879 




gb:RC1-HT0256-081199-011-f01 HT0256 Homo


3.48 


446074 


AA079799 


Hs.29263 


hypothetical protein FLJ11896


3.48 
452582


AU37407 


Hs.29911 


Homo sapiens mRNA; cDNA DKFZp434M232 (fr


3.48 


431542 


H63010 


Hs.5740 


ESTs 


3.48 


432697 


AW975050 


Hs.293892 


ESTs, Weakly similar to ALU4_HUMAN ALU S 


3.48 


435572 


AW975339 


Hs.239828 


ESTs, Weakly similar to QAQ2 J1UMAN RETRO 


3.47 


407192 


AA609200 




gb»f12e02.s1 SoaresJestis.NHT Homo sap 


3.47


413435 


X51405 


Hs.75360 


carboxypeptidase E


3.46


447210 


AF035269 


Hs.17752 


phosphatidylserine-specific phospholipas


3.46 


447958 


AW796524 


Hs.68644


Homo sapiens microsomal signal peptidase 


3.46 


425312 


AA354940 


Hs.145958 


ESTs 


3.46 


442007 


AA301115 


Hs.142838 


nucleolar phosphoprotein Nopp34


3.46


417455 


AW007066 


Hs.18949


ESTs, Weakly similar to CA2B_HUMAN COLLA


3.45


426931 


NM_003416


Hs.2076 


zinc finger protein 7 (KOX 4, clone HF.1 


3.45 


408739 


W01556 


Hs.238797 


ESTs, Moderately similar to 138022 hypot 


3.45 


436024 


AI800041 


Hs.190555 


ESTs 


3.45 


408418 


AW963897 


Hs.44743 


KIAA1435 protein 


3.45


409151 


AA306105 


Hs.50785 


SEC22, vesicle trafficking protein (S. c 


3.44 


418626 


AW299508 


Hs.135230 


ESTs 


3.44 


420560 


AW207748 


Hs.59115 


ESTs 


3.44 


420686 


AI950339 


Hs.40782 


ESTs 


3.44 


428870 


AA436831 


Hs.36049 


ESTs 


3.44 


436754 


AI061288


Hs.133437 


ESTs 


3.44 


437960 


AI669586 


Hs.222194 


ESTs 


3.44 


452300 


AW628045 


Hs.28896 


Homo sapiens mRNA full length insert cDN 


3.44 


421887 


AW161450 


Hs.109201 


CGI-86 protein


3.44 
TABLE 5A shows the accession numbers for those primekeys lacking a UnigeneID in Tables
5, 6, and 7. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
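The three columns defined above can be illustrated with a minimal Python sketch. It assumes each Table 5A row has been laid out on a single line as whitespace-separated Pkey, CAT number, and accession numbers; the names `ClusterRow` and `parse_cluster_row` are hypothetical and are not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterRow:
    """One Table 5A row: a probeset, its gene cluster, and the cluster's members."""
    pkey: int              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number, kept as text (e.g. "184129.1")
    accessions: List[str]  # Genbank accession numbers comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-separated row into Pkey, CAT number, and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number, and at least one accession: {line!r}")
    return ClusterRow(pkey=int(fields[0]), cat_number=fields[1], accessions=fields[2:])


# Example using the row for probeset 419346 (assumed single-line layout).
row = parse_cluster_row("419346 184129.1 AI830417 AA236612")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the CAT number as text avoids treating the cluster suffix as a decimal fraction.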



Pkey CAT number Accession

407596 1003489.1 R86913 R86901 H25352 R01370 H43764 AW044451 W21298
408432 1058667.1 AW195262 R27868 AW811262
409752 115301.1 AW963990 AA078196 AW749482 AA077468 BE151571 AA376917
409770 1154048.1 AW499536 AW499553 AW502138 AW499537 AW502136 AW501743
411440 124577.1 AW749402 AW749403 Z45743 R80376 AA093358
411479 1247077.1 AW848047 AW848202 AW848631 AW848142 AW848702 AW848121 AW848632 AW848140 AW848571 AW848009 AW848067 AW848069 AW848905 AW848214
411624 1252166.1 BE145964 BE146286 AW854564
412991 134248.1 AW949013 AA126111
414269 143133.1 AA298489 AA137165
415123 1523390.1 D60925 D60828 D80787
415715 1548818.1 F30364 F36559 T15435
416288 1585983.1 H51299 H44619 H46391 R86024 H51892 T72744
416289 1586037.1 W26333 R05358 H44682
417730 1695795.1 Z44761 R25801 R11926 R35604
416636 177402.1 AW749855 AA225995 AW750208 AW750206
419346 184129.1 AI830417 AA236612
419536 185688.1 AA603305 AA244095 AA244183
420111 190755.1 AA255652 AA280911 AW967920 AA262684
422219 213547.1 AW978073 AW978072 AA807550 AA306567
424179 236389.1 F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502
424242 237181.1 AA337476 AW966227 AA450376 AW950222 AA381051
428002 285602.1 AA418703 AA418711 BE071915 BE071920 BE071912
429163 300543.1 AA884766 AW974271 AA592975 AA447312
432189 342819.1 AA527941 AI810608 AI620190 AA635266
432340 345248.1 AA534222 AA632632 T81234
432363 345469.1 AA534489 AW970240 AW970323
432966 356839.1 AA650114 AW974148 AA572946
433586 370470.1 T85301 AW517087 AA601054 BE073959
433641 37186.1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583716 AI572574 N25695 AW665466 AI818326 AA128128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 AA204833 AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 AJ352545 BE501030 AI652535 BE465762 AA206331 AW451866 AM71088 AA206342 AA204834 AA208100 AW021661 AA332922 N66048 AA703398 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 AA348354 AI493192
433687 373061.1 AA743991 AA604852 AW272737
433891 376239.1 AA613792 AW182329 T05304 AW858385
434415 [...].1 BE177494 AW276909 AA632849
434565 [...].1 T52172 AF147324 T52248
434804 393481.1 AA649530 AA659316 H64973
437113 433234.1 AA744693 AW750059
444168 593829.1 AW379879 AI126285 H12014
448212 755099.1 AI475858 AW969013
448310 757918.1 AI480316 AW847535
451746 [...].1 M86178 AI813822 D56993
452560 922216.1 BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212 AW806207 AW806208 AW606210 AI907497
452712 928309.1 AW838616 AW838660 BE144343 AI914520 AW888910 BE184854 BE184784
453773 980699.1 AL133761 AL133767
455276 1272541.1 BE176479 BE176678 BE176357 BE176550 AW886079 BE176676 BE176615 BE176555 BE176489 BE178810 BE176362
455309 1278153.1 AW894017 AW893956 AW894032
TABLE 5B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the
genomic sequence source used for prediction. Nucleotide locations of each predicted exon
are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
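As a hedged illustration of how the nucleotide position column can be read, the sketch below splits a comma-separated list of "start-end" ranges into exon intervals. The function names `parse_exons` and `total_exon_length` are hypothetical, and the treatment of both endpoints as inclusive is an assumption that the table itself does not state.

```python
from typing import List, Tuple


def parse_exons(nt_position: str) -> List[Tuple[int, int]]:
    """Turn a string such as "90044-90184,91111-91345" into (start, end) pairs."""
    exons = []
    for part in nt_position.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start exceeds end: {part!r}")
        exons.append((start, end))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, ASSUMING inclusive coordinates (end - start + 1)."""
    return sum(end - start + 1 for start, end in exons)


# Example using the positions listed for probeset 401045 (GI 8117619, Plus strand).
exons = parse_exons("90044-90184,91111-91345")
print(exons, total_exon_length(exons))
```

Entries whose ranges were truncated in reproduction cannot be parsed this way and would need to be checked against the cited genomic sequence.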



Pkey Ref Strand Nt_position

401045 8117619 Plus 90044-90184,91111-91345
401424 8176894 Plus 24223-24428
401451 6634068 Minus 119926-121272
401714 6715702 Plus 96484-96681
401747 9789672 Minus 118596-118816,119119-119244,119609-[...]131258,131866-131932,132451-132575,133580-134011
401785 7249190 Minus 165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168[...]
401819 7467933 Minus 28217-28486
402408 9796239 Minus 110326-110491
402444 9796614 Plus 28391-28517
402791 6137008 Minus 51036-51207
403047 3540153 Minus 59793-59983
403137 9211494 Minus 92349-92572,92958-93084,93579-93712,93949-94072,94591-94748,95214-95337
403721 7528046 Minus 156647-157366
403764 7717105 Minus 118692-118853
403797 8099896 Minus 123065-125008
404165 9926489 Minus 69025-69128
404210 5006246 Plus 169926-170121
404253 9367202 Minus 55675-56055
404561 9795980 Minus 69039-70100
404571 7249169 Minus 112450-112648
404721 9856648 Minus 173763-174294
404915 7341766 Minus 100915-101087
404939 6862697 Plus 175318-175476
405403 6850244 Minus 37491-37670,40951-41031
405685 4508129 Minus 37956-38097
405718 9795467 Plus 113080-113266
405793 1405887 Minus 89197-89453
405876 6758747 Plus 39694-40031
405917 7712162 Minus 106829-107213
406414 9256407 Plus 49593-49850
406554 7711566 Plus 106956-107121
TABLE 6: 286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO
NORMAL ADULT TISSUES

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues
that are likely to encode extracellular or cell-surface proteins. These were selected as for
Table 5, with the additional requirement that the predicted protein contain a structural
domain indicative of extracellular localization (e.g. egf, 7tm domains).



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal tissue




Pkey


ExAccn


UnigeneID


Unigene Title


R1


41/3001 


NM_UUObO<£ 


HS.544t6 


sine oculis homeobox (Drosophila) homolo


4o2o 




AA10CQQC 


U. CGI AC 

113,00145 


thymosin, beta, identified in neuroblast


AH OA 
40x4 


400298 


AA032279 


Hs.61635 


six transmembrane epithelial antigen of 


AO AQ 


41U154 




Hs.95420 


JM27 protein 


Ai 19 
41.12 


426747 


AA535210


Hs.171995 


kallikrein 3, (prostate specific antigen


31.80


400299 




Hs.171995 


kallikrein 3, (prostate specific antigen 


9A 01 

24.31 


425U/5 


AACAftQOi 

M/\3UWc4 


Hs.1852 


acid phosphatase, prostate


24.20


424840


AU077324


Hs.1832 


neuropeptide Y 


OO EH 


405685 








20.90


420757 


V7QC/M 

A76592 


Hs.99915 


androgen receptor (dihydrotestosterone r 


18.72


418994 


AA29o520 


Hs.89546 


selectin E (endothelial adhesion molecul 


19.56 


452792 


AB037765


Hs.30652 


KIAA1344 protein 


17.39 


445472 


AB006631 


u A Ant OA 

hS.127o4 


Homo sapiens mRNA for KIAA0293 gene, par


17.00 


414565 


AA502972 


Hs.183390 


hypothetical protein FLJ13590


16.82 


431716 


D89053


Hs.268012


fatty-acid-Coenzyme A ligase, long-chain


16.60 


408430 


S79876 


Hs.44926 


dipeptidylpeptidase IV (CD26, adenosine


16.28


408000 


L11690 


Hs.620 


bullous pemphigoid antigen 1 (230/240kD) 


15.54


40U2£0 


DCZ4000& 


Hs.2551


adrenergic, beta-2-, receptor, surface 


iRAft 

\U.H\J 


444484 


AK002126 


Hs.11260 


hypothetical protein FLJ11264


14.76 


418601 


AA279490 


Hs.86368 


calmegin 


14.56


448999 


AF179274 


Hs.22791


transmembrane protein with EGF-like and


14.55


416182 


NM_004354


Hs.79069 


cyclin G2


12.94 


420544 


AA677577 


Hs.98732 


Homo sapiens Chromosome 16 BAC clone CiT 


12.79 


445413 


AA151342 


Hs.12677 


CGI-147 protein 


12.64 


453930 


AA419466 


Hs^6727 


hypothetical protein FLJ10903


12.22


440286 


U29589 


Hs.7138 


cholinergic receptor, muscarinic 3 


12.04


452764 


BE463857 


Hs.151258 


hypothetical protein FLJ21062


11.86 


450203 


AF097994 


Hs.301528


L-kynurenine/alpha-aminoadipate aminotra


11.68 


448045 


AJ297436 


Hs.20166


prostate stem cell antigen 


11.51 


449650 


AF055575 


Hs.23838


calcium channel, voltage-dependent, I ty 


11.18 


420381 


D50640 


Hs.337616 


phosphodiesterase 3B, cGMP-inhibited


11.10 


425665 


AK001050 


Hs.159066 


hypothetical protein FLJ10188


11.08 


425710 


AF030880 


Hs.159275 


solute carrier family, member 4 


11.08 


428728 


NM_016625


Hs.191381 


hypothetical protein 


11.04 


407021 


U52077 




gb:Human mariner1 transposase gene, comp


11.02


410733 


D84284 


Hs.66052 


CD38 antigen (p45) 


11.02


452340 


NM_002202


Hs.505 


ISL1 transcription factor, LIM/homeodoma 


10.85 


428819 


AL135623 


Hs.193914 


KIAA0575 gene product 


10.48


421991 


NM_014918


Hs.110488 


KIAA0990 protein 


10.04 


431217 


NM_013427


Hs.250830


Rho GTPase activating protein 6 


9.75 


421470 


R27496 


Hs.1378 


annexin A3


9.64 


409262 


AK000631 


Hs.52256 


hypothetical protein FLJ20624


9.45 


435980 


AF274571 


Hs.129142 


deoxyribonuclease II beta


9.24


421246 


AW582962 


Hs.102897 


CGI-47 protein 


9.20


410001 


AB041036 


Hs.57771 


kallikrein 11 


9.03 


441791 


AW372449 


Hs.175982 


hypothetical protein FLJ21159


9.02 
456497


AW967956 


Hs.123648


419968 


X04430 


Hs.93913 


433172 


AB037841 


Hs.102652


422631 


BE218919


Hs.118793


427674 


NM_003528 


Hs.2178 


404915 






452259 


AA317439 


Hs.28707 


452891 


N75582 


Hs.212875 


439731 


AI953135 


Hs.45140 


419839 


U24577 


Hs.93304 


420120 


AL049810 


Hs.95243


424099 


AF071202 


Hs.139336


448706 


AW2910S5 


Hs.21814 


410227 


AB009284


Hs.61152 


425211 


M18667 


Hs.1867 


441736 


AW292779 


Hs.169799 


419991 


AJ000098 


Hs.94210 


425016 


BE245277 


Hs.154196


424560 


AA158727 


Hs.150555 


409110 


AA191493 


Hs.48778 


421566 


NM_000399


Hs.1395 


431725 


X65724 


Hs.2839 


425782 


U664& 


Hs.159525


427408 


AA583206 


Hs.2156 


435604 


AA625279 


Hs.26892


415874 


AF091622 


Hs.78893 


401451 






431778 


AL080276 


Hs.268562 


409089 


NM_014781


Hs.50421 


431892 


NM_002742




404253 






421552 


AF026692


Hs.105700


416806 


NM_000288


Hs.79993


431958 




Hs.2877 




AF100143 


Hs.6540 


416836 


D54745 


Hs 80247 


433383 


AF034837 


Hs 192731 

1 lOt 1 Ufa F ** 1 


450728 


AW1B2923 


Hs.25363 


413384 


mm Q00401 


Hs.75334 


491140 


AP010258 


Ha 127428 


424800 


AL035588 


Hs 153203 


425451 


AF242769 


Hs 157461 


447359 


NM 012093 


H3 18268 




YQ1662 

A9IOQC 


Hs.66744 


408820 


NM 008042 


Hs.48384 


/COQ1 I 




Hs.4007 


HwOl 0 


MM 015434 


Hs.48604 






Hs^5040 




A F2 18751 


Hs.26813 


400301 




Hs.1 657 


*\ IWf / 


1 41A07 


Hs.934 






Hs_2732fl4 




ARflA7flQ1 
nOWftf? 1 


He 18340 

MOa IWH5 


410232 


AW372451 


Hs.61184 


422762 


AL031320 


Hs.1 19976 


450816 


AL133067 


HS.3026B9 


408621 


AJ970672 


Hs.46638 


439671 


AW162840 


Hs.6641 


410196 


AI936442 


Hs.59833 


429170 


NM 001394 


Hs.2359 


440738 


A1004650 


Hs.225674 


414342 


AA742181 


Hs.75912 


422634 


NM-016010 


Hs.1 18821 


400268 






439569 


AW602166 


Hs.222399 


452823 


AB012124 


Hs.30696 


431938 


AA938471 


Hs.54431 


427638 


AA406411 


HS.208341 



ESTs, Weakly similar to AF108460_1 ubinu
interleukin 6 (interferon, beta 2)
hypothetical protein ASH1
hypothetical protein FLJ10688
H2B histone family, member Q

signal sequence receptor, gamma (translo
ESTs, Weakly similar to DYH9_HUMAN CILIA
hypothetical protein FLJ14084
phospholipase A2, group VII (platelet-ac
transcription elongation factor A (SII)-
ATP-binding cassette, sub-family C (CFTR
interleukin 20 receptor, alpha
exostoses (multiple)-like 2
progastricsin (pepsinogen C)
ESTs

eyes absent (Drosophila) homolog 1
E4F transcription factor 1
protein predicted by clone 23733
niban protein

early growth response 2 (Krox-20 (Drosop
Norrie disease (pseudoglioma)
cell growth regulatory with EF-hand doma
RAR-related orphan receptor A
uncharacterized bone marrow protein BM04
KIAA0244 protein

regulator of G-protein signalling 17
KIAA0203 gene product
protein kinase C, mu

secreted frizzled-related protein 4
peroxisomal biogenesis factor 7
cadherin 3, type 1, P-cadherin (placenta
fibroblast growth factor 13
cholecystokinin

double-stranded RNA specific adenosine d

presenilin 2 (Alzheimer disease 4)

exostoses (multiple) 2

homeo box A9

MyoD family inhibitor

mesenchymal stem cell protein DSC54

adenylate kinase 5

twist (Drosophila) homolog (acrocephalos
heparan sulfate (glucosamine) 3-O-sulfot
Sarcolemmal-associated protein
DKFZP434B168 protein
zinc finger protein 239
C0A14

estrogen receptor 1

glucosaminyl (N-acetyl) transferase 2, I
hypothetical protein FLJ20069
KIAA0431 protein
CGI-79 protein

Human DNA sequence from clone RP1-20N2 o

hypothetical protein

chromosome 11 open reading frame 8

kinesin family member 5C

hypothetical protein FLJ10803

dual specificity phosphatase 4

WD repeat domain 9

KIAA0257 protein

CGI-62 protein

CEGP1 protein

transcription factor-like 5 (basic helix
specific granule protein (28 kDa); cyste
ESTs, Weakly similar to KIAA0989 protein
421264


AL039123 


Hs.103042


microtubule-associated protein 1B


421685 


AF189723 


Hs.106778 


ATPase, Ca++ transporting, type 2C, memb 


421957


AI133161 


Hs.286131 


CGH01 protein 


422806 


BE314767 


Hs.1581 


glutathione S-transferase theta 2


432281 


AK001239 


Hs.274263 


hypothetical protein FLJ10377


451982 


F13036 


Hs.27373 


Homo sapiens mRNA; cDNA DKFZp56401763 (f 


444042 


NM_004915


Hs.10237 


ATP-binding cassette, sub-family G (WHIT


447752 


M73700 


Hs.105938


lactotransferrin


451418 


BE387790 


Hs.26369 


hypothetical protein FLJ20267


428593 


AW207440 


Hs.185973 


degenerative spermatocyte (homolog Droso 


447541 


AK000288


Hs.18800


hypothetical protein FLJ20281


459294 


AW977286 


Hs.17428 


RBP1-like protein


424692 


AA429834 


Hs.151791 


KIAA0092 gene product 


416434 


AW163045 


Hs.79334 


nuclear factor, Interleukin 3 regulated 


410268 


AA316181 


Hs.61635 


six transmembrane epithelial antigen of 


417517 


AF001176 


Hs.82238 


POP4 (processing of precursor, S. cerev


453616 


NM_003462


Hs.33846 


dynein, axonemal, light intermediate pol


427958 


AA418000


Hs.98280 


potassium intermediate/small conductance 


407945 


X69208 


Hs.606 


ATPase, Cu++ transporting, alpha polypep 


*HOOfO 




Hs.289104 


Alu-binding protein with zinc finger dom




Y15723 


Hs 75295 


guanylate cyclase 1 , soluble, alpha 3 




AK000992 


Hs578732 


hypothetical protein FLJ20285


426342 




Hs 169378 


multiple PDZ domain protein 




NM (105754 


Hs P9QB89 


Ras-GTPase-activating protein SH3-domain




AW850417 


Hs.254020 


ESTs, Moderately similar to unnamed prot 




MM fYVAQKC 
FiW^JUv'tvOO 


He. 247118 


nhosnhafidvtlnDsito! cWcan class B 


HQ 1300 


AR02900S 


Hs.26334 


silastic oaraDleoia 4 (autosomal dominant 


40/&11 


AWQ79KR.R 
MVV9r£ODO 


He ooogo 


ESTs Waakiv similar to S51797 vasodilat 


*t£O00 1 


NM 001490 


Hs.159642 


glucosamlnyl (N-acetyl) transferase 1, c 


4ZlO09 


Mft7ft0n 


no. lUDotD 


KIAA1698 nmtfiin 


410000 


D«'WU0J 


no./9i30£ 


rah'nobla5tQma<like 2 fn130) 


43&00O 


ptOcvsO 




ESTs Weakfv simSar to JC7328 amino aci 


4wWf 








401 1 If 




Hs 250500 


da Its (Drosophila)-like 1 


A07M7 




Hs.199179 

M3. 199 1 IO 


RAN bindina orotain 2 


AOQQClA 
44»tJU4 


nWAW/ IO 


Ue 1O07OA 


hvnnihfiticfll Drotoin FU2070B 


AAQfTTi 


KIM 00^79 


He 99QRO 


hroflct estdtiQTfts amnlrfiad saausnoa 2 

UIOCI9V vOIMIIUIlia CUlipilllQW OOkJWOIIWW & 


4U/090 


QOCQ-IO 

noO? 10 




ah*va30f05 ri Soarss fatal fiver sotaen 


4OO010 


RP1797AA 
0Cl/£/U4 


Ue 999748 


KIAA1610 Drotein 


400009 


nYVo/ODOO 


U Q 179RA0 

ns. 1 f <.o*to 


ESTs 


Aoonoo 


NM 001141 


Hs.1 11258 

1 19> III 


arachldonatB 15-Qpoxygenase, second typ 




\AHMft7 


Me 90A79 


Inw rinneMv lirmnrntflirt rflnpntnr-rfltatftd 


4£lU40 


MM MQAAG 


He 988196 


spondin 2, extracellular matrix protein 


424602 


a i/nrwnee 
AKUUZUOO 


ns.loiU40 


hunnthntlral nrntflln FLU 1 193 


a 4 mutt 


AICQX070 


u e caipft 


n<iHone/imA AceAmhtu nmtain 1-fIkfi 9 


4 lot) /a 


717flfl*i 
£ I /OUO 


He 00CRA 


Homer, neuronal Immediate early gene, 2 


450649 




U(> OC079 
nSJ:0£f& 


PI A hinriinn nmtnln rAOO 


41 1 024 


Otl40»04 


Ue mooao 
ns.luKOO 




4U4/&1 








426261 


AW24ZZ40 


lit! ieaR7n 


narnvlcnmfll ffllTIAevtfltnrl nmtflEll 
pol ojuoui i wii iciitiooyiaiou piuioui 


410Z7Q 


U41U0U 


Ue 701 OR 
nS.r9l00 


1 nrntain Acjmnan ipniilAtA/l 

wIV 1 UlwlCIII, iMUW^oll lOUUIOiOU 


408374 


AWQ25430 


Un 4 CCCfH 

ns. 1 55o91 


(OiKIloau Vva r l 


401 CW 


AR0941Q0 
nOUi:0l99 


He 97907 


KIAATJ982 Drotein 


421407 


AWR910C9 


He IHAOOA 


hunnrtifltical Drotein 




AA789081 

tv\l Q9VO 1 


Hs.4029 


gtioma-amplined sequence-41 


403764 








421247 


BE391727 


Hs.102910 


general transcription factor 1IH, polype 


403721 








453070 


AK001465 


H3.31575 


SEC63, endoplasmic reticulum translocon 


417412 


X16896 


Hs.82112 


interleukin 1 receptor, type 1 


439735 


AI635386 


Hs.142846 


hypothetical protein 


430261 


AA305127 


Hs^37225 


hypothetical protein HT023 


430598 


AK001764 


HS247112 


hypothetical protein FU10902 


400303 


AA242758 


H3.79136 


LIV-1 protein, estrogen regulated 


438209 


AL120659 


Hs.6111 


aryt-hydrocarbon receptor nuclear transl 


417421 


AL138201 


Hs.82120 


nuclear receptor subfamily 4, group A, m 


447270 


AC002551 


Hs.331 


general transcription factor IIIC, polyp 


434423 


NM_006769 


HS3844 


L1M domain only 4 


404561 
5.00
4.84

4.84
4.82

4.82
4.74

4.74

4.72

4.70

4.70

4.64

4.64

4.63

4.63
433
4.48

4.48

4.46
4.42
4.35
422969


AA782536 


Hs.122647 


N-myristoyltransferase 2 


4.32 


423685 


BE350494 


Hs.49753 


uveal autoantigen with coiled coil domai


4.32


425071 


NM_013989


Hs.154424


deiodinase, iodothyronine, type II


4.32


431583 


AL042613 


Hs.262476 


S-adenosylmethionine decarboxylase 1


4.31


442818 


AK001741 


Hs.8739 


hypothetical protein FLJ10879


4.30 


423740 


Y07701 


Hs.293007 


aminopeptidase puromycin sensitive


4.24 


424701 


NM_005923


Hs.151988 


mitogen-activated protein kinase kinase 


4.21


424085 


NM_002914 


Hs.139226 


replication factor C (activator 1) 2 (40


4.20


410294 


AB014515 


Hs.323712 


KIAA0615 gene product 


4.18 


447124 


AW976438 


Hs.17428 


RBP1-like protein


4.18 


438018 


AK001160 


Hs.5999 


hypothetical protein FLJ10298


4.16 


443857 


AI089292 


Hs£87621 


hypothetical protein FLJ14069


4.15 


446711 


AF169692 


Hs.12450 


protocadherin 9 


4.15 


405403 








4.14 


446148 


NM_016578


Hs.20509 


HBV pX associated protein-8 


4.13 


417531 


NM_003157


Hs.1087 


serine/threonine kinase 2 


4.12 


433345 


AI681545


Hs.152982 


hypothetical protein FLJ13117


4.10 


432712 


AB016247 


Hs.288031 


sterol-C5-desaturase (fungal ERG3, delta 


4.09 


435114 


AA775483 


Hs.288936 


mitochondrial ribosomal protein L9 


4.08 


445459 


AI478629 


Hs.158465 


likely ortholog of mouse putative IKK re


4.08 


402791 








4.04 


438660 


U95740 


Hs.6349 


Homo sapiens, clone IMAGE:3010666, mRNA, 


4.04


447568 


AF155655 


Hs.18885 


CGI-116 protein


4.04 


452211 


AI985513 


Hs.233420 


ESTs 


4.02 


443292 


AK000213 


Hs.9198


hypothetical protein 


4.01 


420911 


U77413 


Hs.100293 


O-linked N-acetylglucosamine (GlcNAc) tr


4.00 


428738 


NM_000380


Hs.192803 


xeroderma pigmentosum, complementation g 


3.95 


430456 


AA314998 


Hs.241503 


hypothetical protein 


3.95 


437531 


AI400752


Hs.112259


T cell receptor gamma locus 


3.93 


428695 


AI355647


Hs.189999


purinergic receptor (family A group 5) 


3.91 


410011 


AB020641


Hs.57856 


PFTAIRE protein kinase 1 


3.91 


446494 


AA463276 


Hs.288906 


WW Domain-Containing Gene 


3.91 


409928 


AL137163 


Hs.57549


hypothetical protein dJ473B4


3.90 


411596 


BE336654 


Hs.70937 


H3 histone family, member A 


3.90 


425707 


AF115402


Hs.11713 


E74-like factor 5 (ets domain transcript


3.90 


451806 


NM_003729


Hs.27076 


RNA 3'-terminal phosphate cyclase


3.89 


401045 








3.89


437372 


AA323968 


Hs.283631 


hypothetical protein DKFZp547G183 


3.89 


417067 


AJ001417 


Hs.81086 


solute carrier family 22 (extraneuronal


3.88


410467 


AF102546 


Hs.63931 


dachshund (Drosophila) homolog


3.88


431930 


AB035301 


Hs.272211


cadherin 7, type 2


3.88


453047 


AW023798 


Hs.286025 


ESTs 


3.88 


401785 








3.88 


458229 


AI929602


Hs.177 


phosphatidylinositol glycan, class H


3.86


406414 








3.86


412494 


AL133900


Hs.792 


ADP-ribosylation factor domain protein 1 


3.84 


418329 


AW247430 


Hs.84152 


cystathionine-beta-synthase


3.83 


424850 


AA151057 


Hs.153498 


chromosome 18 open reading frame 1 


3.82


427585 


D31152 


Hs.179729 


collagen, type X, alpha 1 (Schmid metaph


3.82 


423052 


M28214 


Hs.123072 


RAB3B, member RAS oncogene family


3.82 


418111 


AA033813 


Hs.79018 


chromatin assembly factor 1, subunit A (


3.82 


419423 


D26488 


Hs.90315 


KIAA0007 protein 


3.80 


429643 


AA455889 


Hs.167279


FYVE-finger-containing Rab5 effector pro


3.80 


431499 


NM_001514


Hs.258561


general transcription factor IIB 


3.80 


444078 


BE246919 


Hs.10290


U5 snRNP-specific 40 kDa protein (hPrp8- 


3.78 


430291 


AV660345 


Hs.238126 


CGI-49 protein 


3.76 


431637 


AI879330 


Hs.265960 


hypothetical protein FLJ10563


3.74 


440411 


N30256 


Hs.151093 


hypothetical protein DKFZp434G1415 


3.74 


405917 








3.74 


451230 


BE546208 


Hs.26090 


hypothetical protein FLJ20272


3.73 


429597 


NM_003816


Hs.2442 


a disintegrin and metalloproteinase doma


3.73 


415075 


L27479 


Hs.77889 


Friedreich ataxia region gene X123 


3.72


440351 


AF030933 


Hs.7179 


RAD1(S.pombe) homolog 


3.70 


443603 


BE502601 


Hs.134289 


ESTs, Weakly similar to KIAA1063 protein 


3.70 


446965 


BE242873 


Hs.16677 


WD repeat domain 15 


3.70 


412350 


AI659306


Hs.73826 


protein tyrosine phosphatase, non-recept 


3.70 


433852 


AI378329 


Hs.126629 


ESTs 


3.70 


447397 


BE247678 


Hs.18442 


E-1 enzyme 


3.68 


405718 






3.68 
425217


AU076696 


Hs.155174 


CDC5 (cell division cycle 5, S. pombe, h


3.68 


421734 


AI318624 


Hs.107444 


Homo sapiens cDNA FLJ20562 fis, clone KA


3.67 


427221 


L15409 


Hs.174007 


von Hippel-Lindau syndrome


3.67 


402408 








3.66 


452946 


X95425 


Hs.31092 


EphA5 


3.66 


419078 


M93119 


Hs.89564 


insulinoma-associated 1


3.66 


427144 


X95097 


Hs.2126 


vasoactive intestinal peptide receptor 2 


3.65 


423398 


AI382555 


Hs.127950 


bromodomain-containing 1


3.65 


446320 


AF126245 


Hs.14791 


acyl-Coenzyme A dehydrogenase family, me 


3.63 


404939 








3.62 


403137 








3.60 


437162 


AW005505 


Hs.5464 


thyroid hormone receptor coactivating pr


3.60 


404210 








3.59 


443775 


AF291664 


Hs.204732 


matrix metalloproteinase 26


3.56 


452501 


AB037791 


Hs.29716 


hypothetical protein FLJ10980


3.56


422443 


NM_014707


Hs.116753 


histone deacetylase 7B


3.55


420230 


AL034344 


Hs.284186 


forkhead box C1


3.55 


418428 


Y12490 


Hs.85092 


thyroid hormone receptor interactor 11 


3.54


433002 


AF048730 


Hs.279906 


cyclin T1


3.53


405793 








3.52


457940 


AL360159 


Hs.306517 


Homo sapiens TRIpartite motif protein ps 


3.52 


402444 








3.52


418250 


U29926 


Hs.83918


adenosine monophosphate deaminase (isofo


3.51


414222 


AL135173 


Hs.878


sorbitol dehydrogenase


3.51


422384 


AA224077 


Hs.42438 


Sm protein F 


3.50


447805


AW627932


Hs.19614


gemin4


3.50


454265


H03556


Hs.300949


ESTs, Weakly similar to thyroid hormone


3.50


423445


NM_014324


Hs.128749


alpha-methylacyl-CoA racemase


3.48


413435 


X51405 


Hs.75360 


carboxypeptidase E


3.46 


447210 


AF035269 


Hs.17752 


phosphatidylserine-specific phospholipas


3.46


426931 


NM_003416 


Hs.2076 


zinc finger protein 7 (KOX 4, clone HF.1 


3.45 


408418 


AW963897 


Hs.44743 


KIAA1435 protein 


3.45 


421887 


AW161450 


Hs.109201 


CGI-86 protein 


3.44 
Table 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5, with the further
requirement that the predicted protein contain a structural domain indicative of a druggable
structure (e.g., protease, kinase, phosphatase, receptor). The functional domain is indicated for each gene.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

PSDomain: Protein Structural Domain

R1: Ratio of tumor vs. normal tissue



Pkey ExAccn UnigeneID Unigene Title PSDomain R1

426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 31.80

400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen trypsin 24.91

420757 X78592 Hs.99915 androgen receptor (dihydrotestosterone r Androgen_recep,hormone_rec,zf-C4 19.72

403430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine DPPIV_N_term,Peptidase_S9 16.28

430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 7tm_1 15.40

411096 U80034 Hs.68583 mitochondrial intermediate peptidase Peptidase_M3 14.81

440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 7tm_1 12.04

420381 D50640 Hs.337616 phosphodiesterase 3B, cGMP-inhibited PDEase 11.10

407021 U52077 gb:Human mariner1 transposase gene, comp SET,Transposase_1 11.02

401424 arginase 9.58

410001 AB041036 Hs.57771 kallikrein 11 trypsin 9.03

428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, Peptidase_M10 8.76

424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR ABC_tran,ABC_membrane 7.64

419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 Hydrolase 7.20

431992 NM_002742 Hs.2891 protein kinase C, mu pkinase,DAG_PE-bind,PH 6.49

447359 NM_012093 Hs.18268 adenylate kinase 5 adenylatekinase 6.00

400301 X03635 Hs.1657 estrogen receptor 1 Oest_recep,zf-C4,hormone_rec 5.78

421685 AF189723 Hs.106778 ATPase, Ca++ transporting, type 2C, memb E1-E2_ATPase,Hydrolase 5.37

444042 NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT ABC_tran 5.31

447752 M73700 Hs.105938 lactotransferrin transferrin,7tm_1 5.29

407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep E1-E2_ATPase,Hydrolase,HMA 5.08

403047 trypsin 4.91

427617 D42063 Hs.199179 RAN binding protein 2 Ran_BP1,zf-RanBP,TPR,pro_isomerase 4.88

422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second typ lipoxygenase,PLAT 4.82

449535 W15267 Hs.23672 low density lipoprotein receptor-related ldl_recept_b,ldl_recept_a,EGF 4.82

425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II T4_deiodinase 4.32

423740 Y07701 Hs.293007 aminopeptidase puromycin sensitive Peptidase_M1 4.24

424701 NM_005923 Hs.151988 mitogen-activated protein kinase kinase pkinase 4.21

424085 NM_002914 Hs.139226 replication factor C (activator 1) 2 (40 AAA,Viral_helicase1 4.20

417531 NM_003157 Hs.1087 serine/threonine kinase 2 pkinase 4.12

428695 AI355647 Hs.189999 purinergic receptor (family A group 5) 7tm_1 3.91

410011 AB020641 Hs.57856 PFTAIRE protein kinase 1 pkinase 3.91

424850 AA151057 Hs.153498 chromosome 18 open reading frame 1 ldl_recept_a 3.82

412350 AI659306 Hs.73826 protein tyrosine phosphatase, non-recept Y_phosphatase,Band_41,PDZ 3.70

447397 BE247676 Hs.18442 E-1 enzyme Hydrolase 3.68

452946 X95425 Hs.31092 EphA5 EPH_lbd,fn3,pkinase,SAM 3.66

427144 X95097 Hs.2126 vasoactive intestinal peptide receptor 2 7tmJ 3.65

443775 AF291664 Hs.204732 matrix metalloproteinase 26 Peptidase_M10 3.56

457940 AL360159 Hs.306517 Homo sapiens TRIpartite motif protein ps SPRY,7tm_1 3.52

418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo A_deaminase 3.51

413435 X51405 Hs.75360 carboxypeptidase E Zn_carbOpept 3.46

447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas lipase 3.46
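
The domain-based selection described for Table 7 can be illustrated with a minimal sketch. It assumes each table row has already been parsed into a dictionary; the helper names and the set of "druggable" domain labels are hypothetical, chosen from labels that appear in the PSDomain column, and are not the applicants' actual selection list.

```python
# Illustrative assumption: a small set of domain labels treated as "druggable".
# These labels occur in the PSDomain column of Table 7; the real criteria
# used to build the table are not given in this form.
DRUGGABLE_DOMAINS = {"trypsin", "pkinase", "7tm_1", "Y_phosphatase", "hormone_rec", "PDEase"}


def has_druggable_domain(ps_domain_field, druggable=DRUGGABLE_DOMAINS):
    """Return True if any comma-separated domain label is in the druggable set."""
    domains = {d.strip() for d in ps_domain_field.split(",") if d.strip()}
    return bool(domains & druggable)


def select_small_molecule_targets(rows, min_ratio=0.0):
    """Keep rows with a druggable domain and R1 >= min_ratio, highest ratio first."""
    selected = [
        r for r in rows
        if r["R1"] >= min_ratio and has_druggable_domain(r["PSDomain"])
    ]
    return sorted(selected, key=lambda r: r["R1"], reverse=True)


rows = [
    {"Pkey": 417531, "PSDomain": "pkinase", "R1": 4.12},
    {"Pkey": 400299, "PSDomain": "trypsin", "R1": 24.91},
    # Hypothetical row whose only domain is not in the assumed set.
    {"Pkey": 999999, "PSDomain": "SPRY", "R1": 9.99},
]
targets = select_small_molecule_targets(rows)
```

Matching on exact labels rather than substrings keeps, for example, a hypothetical `pkinase_C` label from being counted as `pkinase` by accident.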
TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE
CANCER COMPARED TO NORMAL PROSTATE

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4
normal prostate tissues. The "average" prostate cancer level was set to the 85th percentile
amongst 73 tumor samples. In order to remove gene-specific background levels of non-specific
hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
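
The ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the percentile uses linear interpolation between order statistics (the text does not name a convention), and a non-positive denominator after background subtraction is reported as `None` (the text does not say how such cases were handled).

```python
from statistics import mean


def percentile(values, pct):
    """Percentile by linear interpolation between order statistics (assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty data")
    rank = (pct / 100.0) * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def normal_to_tumor_ratio(normal, tumor, all_tissues, tumor_pct=85, background_pct=10):
    """Ratio of background-corrected "average" normal to "average" tumor expression.

    normal      -- expression values for the normal prostate samples (mean is used)
    tumor       -- expression values for the tumor samples (85th percentile is used)
    all_tissues -- values for every tissue on the array (10th percentile = background)
    """
    background = percentile(all_tissues, background_pct)
    numerator = mean(normal) - background
    denominator = percentile(tumor, tumor_pct) - background
    if denominator <= 0:
        return None  # assumption: ratio treated as undefined in this case
    return numerator / denominator


# Toy values, not data from the application.
normal = [11.0] * 4
tumor = [3.0] * 73
all_tissues = [1.0] * 20 + tumor + normal
ratio = normal_to_tumor_ratio(normal, tumor, all_tissues)
```

Under this sketch a probeset would be listed in Table 8 when `ratio` is at least 2.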

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of normal prostate to prostate cancer



Pkey ExAccn UnigeneID Unigene Title R1



425932 M81650 Hs.1968 semenogelin I 57.69

425545 N98529 Hs.158295 Human mRNA for myosin light chain 3 (MLC 19.70

426752 X69490 Hs.172004 titin 15.25

442082 R41823 Hs.7413 ESTs; calsyntenin-2 10.05

407245 X90568 Hs.172004 titin 9.38

422711 D60641 Hs.21739 Homo sapiens mRNA; cDNA DKFZp586M518 (f 9.05

420813 X51501 Hs.99949 prolactin-induced protein 8.18

411987 AA375975 Hs.183380 "ESTs, Moderately similar to ALU7_HUMAN 7.45

404567 5.62

416030 H15261 Hs.21948 ESTs 5.51

444892 AI620617 Hs.148565 ESTs 5.27

444573 AW043590 Hs.225023 ESTs 5.20

428068 AW016437 Hs.233462 ESTs 5.08

437440 AA846804 Hs.123694 ESTs 4.95

404113 4.75

452279 AA286844 Hs.61260 hypothetical protein FLJ13164 4.75

421058 AW297967 Hs.188181 ESTs 4.63

445592 AV654382 Hs.17847 "ESTs, Weakly similar to K02F3.10 [C.ele 4.53

405163 4.49

405227 4.45

454059 NM_003154 Hs37048 statherin 4.45

450152 AI138635 Hs.22968 ESTs 4.40

407013 U35837 "gb:Human nebulin mRNA, partial cds" 4.03

403612 4.02

440089 AA864468 Hs.135646 ESTs 4.00

408988 AL119844 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat regi 3.98

436726 AA324975 Hs.128993 "ESTs, Weakly similar to KIAA0465 protei 3.95

459367 BE148877 "gb:CM4-HT0244-111199-040-h12 HT0244 Hom 3.95

427318 AF186081 Hs.175783 zinc transporter 3.92

411782 AW860972 "gb<3Vf>CT0387-180300-167-h07 CT0387 Hom 3.85

418668 AW407987 Hs.87150 Human clone A9A2BR11 (CAC)n/(GTG)n repea 3.75

458311 AF069478 "gb:AF069478 Homo sapiens astrocytoma li 3.61

403649 3.60

419682 H13139 Hs.92282 paired-like homeodomain transcription fa 3.58

412519 AA196241 Hs.73980 "troponin T1, skeletal, slow" 3.51

414206 AW276887 Hs.46609 ESTs 3.45

427419 NM_000200 Hs.177888 histatin 3 3.37

420777 AA280223 Hs.130865 ESTs 3.35

428134 AM21773 Hs.161008 ESTs 3.31

450218 R02018 Hs.168640 "Ank, mouse, homolog of 3.30

433474 AI192195 Hs.147174 "EST, Highly similar to ubiquitin-protei 3.30

418833 AW974899 Hs.292776 ESTs 3.26

400440 X83957 Hs.83870 nebulin 3.16

413778 AA090235 Hs.75535 "myosin, light polypeptide 2, regulatory 3.06

423151 AW838068 "gb:QV3-LT0048-010300-109-f02 LT0048 Hom 3.05

445060 AA830811 Hs.88808 ESTs 2.98

457065 AI476318 Hs.192480 ESTs 2.95

432456 H00093 "gb:ph8f12u_19/1TV Outward Alu-primed hn 2.92

405678 2.85

406707 S73840 Hs.931 "myosin, heavy polypeptide 2, skeletal m 2.61

444105 AW189097 Hs.166597 ESTs 2.78

433968 AL157518 Hs.90421 PRO2463 protein 2.73

438522 AA809431 Hs558B86 ESTs 2.73

436562 H71937 Hs.169756 "complement component 1, s subcomponent" 2.68

412417 AA102268 Hs.42175 ESTs 2.67

455590 BE072259 "gb:QV4-BT0536-271299-059-g04 BT0536 Hom 2.65

415380 F07953 Hs.16085 putative G-protein coupled receptor 2.65

428729 AL162331 Hs.181436 hypothetical protein FLJ10619 2.64

408537 AW207734 "gb:UI-H*l2-ag*h-01-0-UI.s1 NCI_CGAP_S 2.63

424706 AA741336 Hs.152108 transcriptional unit N143 2.63

413212 BE072092 "gb:PM4-BT0532-160200-003*11 BT0532 Hom 2.63

406704 M21665 Hs.929 "myosin, heavy polypeptide 7, cardiac mu 2.62

437507 AA758538 Hs546882 ESTs 2.60

410384 AI933794 Hs.42745 ESTs 2.58

408074 R20723 Hs.124764 ESTs 2.58

436653 AA829828 Hs592402 ESTs 232

458090 AI282149 HS36213 "ESTs, Highly similar to FXD3_HUMAN FORK 231

432003 AI689154 Hs.122972 ESTs 2.50

436915 AA737400 Hs.142230 ESTs 230

410028 AW576454 Hs558553 ESTs 2.46

448920 AW408009 Hs52580 alkylglycerone phosphate synthase 2.45

422046 AI638562 "gb:ts50a10.x1 NCI_CGAP_Ut1 Homo sapiens 2.44

451122 AA015767 Hs.193587 ESTs 2.40

422646 H87863 Hs.151380 ESTs 2.36

451237 AW600293 "gb:EST00049 pGEM-T library Homo sapiens 2.36

400001 AFFX control: BioB-3 2.36

415835 Z45365 "gb:HSG2NF081 normalized infant brain cD 2.36

439706 AW872527 Hs.59761 ESTs 2.36

423341 AW242394 Hs552495 ESTs 2.36

436486 AA742221 Hs.120633 ESTs 2.35

407449 AJ002784 gb:Homo sapiens mRNA; fetal brain cDNA 5 2.33

430573 AA744550 Hs.136345 ESTs 2.32

401974 2.31

443356 AU044498 Hs.133262 "ESTs, Weakly similar to PH0217 reverse 2.31

430751 NM_012471 Hs547868 transient receptor potential channel 5 255

439128 AI949371 Hs.153089 ESTs 255

448765 R15337 Hs51958 "Homo sapiens cDNA FLJ10532 fis, clone N 255

451130 AI762250 Hs511347 ESTs 224

405420 253

455029 AW851258 "gb:IL3-CT0220-160200-066-H06 CT0220 Hom 253

438224 AA933999 "gb:on91f04.s1 Soares_NFL_T_GBC_S1 Homo 253

407764 BE008347 "gb:CM0-BN0154-080400-325-h04 BN0154 Hom 253

413549 BE252470 "gb:601108292F1 NIH_MGC_16 Homo sapiens 253

437010 AA741368 Hs591434 ESTs 253

435111 AI914279 HS513740 ESTs 252

403375 2.21

455060 AW853441 "gb:RC1-CT0252-030l0r>023^09CT0252Hom 251

409792 AW854153 "gb:RC3-CT0254-060400-029-d03 CT0254 Hom 2.20

421154 AA284333 Hs587631 "Homo sapiens cDNA FLJ14269 fis, clone P 2.19

401963 2.18

435034 AF168711 Hs.159397 x 010 protein 2.18

448996 AW998989 Hs.105749 KIAA0553 protein 2.18

436816 AW297599 Hs555667 ESTs 2.17

442252 AI733395 Hs.129124 ESTs 2.17

419310 AA236233 Hs.188716 ESTs 2.16

418579 H91800 Hs.124156 ESTs 2.16

423315 R54109 Hs56096 ESTs 2.16

432744 AA988835 Hs.38664 ESTs 2.15

424492 AI133482 Hs.165210 ESTs 2.15

424770 AA425562 "gb:zw46e05.r1 Soares_total_fetus_Nb2HF8 2.15

437101 AA744518 Hs.120610 ESTs 2.15

428793 AC004957 Hs598975 "ESTs, Highly similar to collapsin-2-lik 2.15

415708 H56475 "gb:yt87d11.r1 Soares_pineal_gland_N3HPG 2.13

459619 2.12

427506 AK000134 Hs.179100 hypothetical protein FLJ20127 2.12

452508 AA804174 Hs.184354 ESTs 2.10

410881 AW809157 "gb:RC0-ST0118-041099-031-c07 ST0118 Homo sapiens cDNA, mRNA sequence" 2.10

403087 2.10

403869 2.10

445028 D81194 Hs.282499 ESTs 2.10

447884 H29505 "gb:ym60d10.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 5', mRNA sequence" 2.10

414575 H11257 Hs.295233 ESTs 2.09

420351 BE218221 Hs.190044 ESTs 2.08

426998 BE274360 "gb:601121068F1 NIH_MGC_10 Homo sapiens cDNA clone 5', mRNA sequence" 2.08

405455 2.08

423843 AA332652 "gb:EST36627 Embryo, 8 week I Homo sapiens cDNA 5' end similar to similar to monoamine oxidase B, mRNA sequence" 2.08

406135 2.07

427046 BE246180 Hs.121385 ESTs 2.07

403493 2.05

444514 AI682905 Hs.270431 "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY [H.sapiens]" 2.05

435884 AA701443 Hs.192868 ESTs 2.05

419629 AB020695 Hs.91662 KIAA0888 protein 2.03

405900 2.03

457350 AW974438 Hs.194136 "ESTs, Moderately similar to AF091457 1 zinc finger protein RIN ZF [R.norvegicus]" 2.02

400007 AFFX control: BioDn-5 2.01

406978 M64358 "gb:Human rhom-3 gene, exon. 1 2.00
TABLE 8A shows the accession numbers for those primekeys lacking a UnigeneID in Table
8. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accessions

407764 1014849_1 BE008347 BE008320 BE083307 BE083311 AW075968

408537 1064753_1 AW207734 D50164 D31150 D8107B D61356 AW996804

409792 1154677_1 AW854153 AW50G210 BE145772 AW501310

410881 1225682_1 AW809157 AW812181 AW812175 AW812172 AW812161 AW812165

411762 1256906_1 AW860972 AW862598 AW862599 AW880988 AW860983 AW860898 AW860925 AW880922 AW860986 AW860984 AW860989

413212 1353792_1 BE072092 BE072106 BE072088 BE072098 BE072103

413549 1375933_2 BE252470 BE147573

415708 1548209_1 H56475 F29401 F34552

415835 1558511_1 Z45365 R25905 H05203 T77496

422048 210744_1 AI638562 T16929 H13401 F07773 R55836

423151 225415_1 AW838068 AW837986 AW838067 AA322487 AW837935

423843 232510_1 AA332652 AA331633 AW999369 AW902993 BE170475 AA378845 AW984175 AI475221

424770 243504_1 AA425562 AI880208 AA346646 N22655 AW811775 AW811786

426998 274259_1 BE274360

432456 347718_1 H00093 H00079 H00070 H00054 H00049 H00063 AW905306 AW905241 AW905410 AW805307 AW905411 AW905240 AW905352 AW905304 AW905239 AW805242 AW905243 H00087 AW905210

438224 452656_1 AA933999 AA781181

447884 740749_1 H29505 R18575 Z43580 T48738 AW35454 BE004683

451237 863269_1 AW600293 AI767468

455029 1249374_1 AW851258 AW851435 AW851106 AW851421

455060 1251259_1 AW853441 BE145228 BE145218 BE145162 BE145283

455590 1335127_1 BE072259 BE072230 BE007911

458311 543550_1 AF069478 AF069479 AF069480
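
The Pkey-to-cluster lookup that Table 8A supports can be sketched as below. The sketch assumes each row is laid out as a probeset key, a CAT (cluster) number, and then one or more Genbank accessions separated by whitespace; the function name and the underscore form of the CAT numbers in the sample rows are assumptions made for illustration.

```python
def parse_cluster_rows(lines):
    """Map each Pkey to its CAT number and the accessions in its cluster.

    Assumes rows of the form: <Pkey> <CAT number> <accession> [<accession> ...].
    Rows with fewer than three fields are skipped rather than guessed at.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        pkey, cat_number, accessions = fields[0], fields[1], fields[2:]
        table[pkey] = {"cat": cat_number, "accessions": accessions}
    return table


sample_rows = [
    "413549 1375933_2 BE252470 BE147573",
    "415708 1548209_1 H56475 F29401 F34552",
    "",  # blank separator lines are ignored
]
clusters = parse_cluster_rows(sample_rows)
exemplar = clusters["415708"]["accessions"][0]
```

In the rows shown here the first accession of each cluster coincides with the exemplar accession (ExAccn) given for the same Pkey in Table 8, which is what `exemplar` picks out.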
TABLE 8B shows the genomic positioning for those primekeys lacking Unigene IDs and
accession numbers in Table 8. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al. Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

401963 3126783 Plus 51382-51521

401974 3126777 Plus 85330-85683

403087 8954241 Plus 169511-169795

403375 9255944 Minus 82554-92795

403493 7341425 Plus 157568-159084

403612 8469060 Minus 94723-94659

403649 8705159 Minus 27141-27247

403869 7280046 Minus 34379-34583

404113 9588571 Minus 13446-13646

404567 7249169 Minus 101320-101501

405163 9986267 Minus 161171-161299

405227 6731245 Minus 22550-22802

405420 7211837 Minus 13428-13582

405455 7656675 Plus 134112-134671

405678 4079670 Plus 151821-152027

405900 6758795 Minus 71181-71535

406135 9164918 Minus 65489-65715
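
As an illustration of how the Table 8B entries might be read programmatically, the sketch below parses one row (probeset key, GI number, strand, nucleotide range) and derives the predicted exon length. The function name is hypothetical, and two conventions are assumed rather than stated in the text: coordinates are treated as inclusive, and a range written high-to-low is normalized before the length is taken.

```python
def parse_exon(row):
    """Parse '<Pkey> <Ref> <Strand> <start>-<end>' into a dictionary.

    Assumptions: both coordinates are inclusive, and the order of the two
    numbers carries no meaning beyond delimiting the span.
    """
    pkey, ref, strand, span = row.split()
    first, second = (int(x) for x in span.split("-"))
    start, end = min(first, second), max(first, second)
    return {
        "pkey": pkey,
        "ref": ref,
        "strand": strand,
        "start": start,
        "end": end,
        "length": end - start + 1,
    }


exon = parse_exon("401963 3126783 Plus 51382-51521")
```

For the row shown, `exon["length"]` is 140 under the inclusive-coordinate assumption.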
TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 8.14. The "average" normal prostate level was set to the mean
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated.
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of prostate cancer to normal prostate



Pkey ExAccn UnigeneID Unigene Title R1

451002 AA013299 Hs.8018 ESTs, Weakly similar to ALU3_HUMAN ALU S 1684.00

435596 AA689465 Hs.188999 ESTs 738.00

443576 AI078027 Hs.169338 ESTs 246.86

434247 AA928116 Hs372065 ESTs 245.20

400452 AK000185 gb:Homo sapiens cDNA FLJ20178 fis, clone 222.00

405932 221.33

427906 AA864330 Hs.166520 ESTs 212.00

443685 AI686550 Hs.174481 ESTs 163.50

fO ISO** rVHIHOOQ Hs.193237 ESTs 149.45

418323 NM_002118 Hs.1162 major histocompatibility complex, class 126.11

429480 M36860 Hs3295 elastin (supravalvular aortic stenosis,

426025 AW138330 Hs333778 ESTs 120.00

418917 X02994 Hs.1217 adenosine deaminase 106.75

404407 105.71

442027 AI652926 Hs.128395 ESTs 100.53

433704 AA608684 Hs.121705 ESTs, Moderately similar to AUICJHUMAN I 94.00

453758 U83527 gb:HSU83527 Human fetal brain (Mlovett) 89.18

415354 F06495 gb:HSC1AB051 normalized infant brain cDN 87.73

424239 M67439 Hs.143526 dopamine receptor D5 8632

444143 AW747996 Hs.160999 ESTs 86.43

401672 77.56

430590 AW383947 H&246381 CD68 antigen 68.47

411972 BE074959 gb:PM0-BT0582-310100-001-f08 BT0582 Homo 68.00

448992 AI766053 Hs.188346 ESTs 61.56

408828 BE540279 gb:601059857F1 NIH_MGC_10 Homo sapiens c 57.71

409653 AW451693 Hs520B26 ESTs 56.40

402964 54.67

422673 N59027 gb:yv59d11.r1 Soares fetal liver spleen 54.00

422568 AA372275 H&279B00 Homo sapiens cDNA FLJ11383 fis, clone HE 54.00

438907 R32704 Hs.301298 ESTs 52.95

405172 52.96

444897 AW137088 Hs.144857 ESTs 52.32

458019 AW592931 Hs.256298 ESTs 51.63

405275 AB028989 Hs.68500 mitogen-activated protein kinase 8 inter 50.98

457815 AA703679 Hs.106999 ESTs, Weakly similar to SYT5_HUMAN SYNAP 49.60

424385 AA339666 gb:EST44776 Fetal brain I Homo sapiens c 48.30

407172 T54095 gb:ya92c05.s1 Stratagene placenta (93722 47.38

428202 AA424163 Hs.156895 ESTs 46.33

435672 AI700148 Hs.283626 ESTs 43.37

420283 AA485224 Hs 57734 G protein-coupled receptor kinase-intera 43.00

417016 AA837098 Hs.269933 ESTs 42.70

438854 AF074994 Hs.24240 ESTs 4237
406134 42.43

457319 AA480895 Hs.201552 ESTs, Weakly similar to T17288 hypotheti 42.31

409314 AA070266 gb:zntf 9d04.r1 Stratagene neuroepithelium 42.25

401124 41.61

429316 AI371157 Hs.178538 ESTs 40X0

420317 AB006628 Hs.96485 KIAA0290 protein 39X4

457586 AW062439 gbJ/Rr>CT006O1 20899-001 -f08 CT0060 Homo 39.60

417407 AA923278 Hs280905 ESTs, Weakly similar to protease [H.sapi 38.73

430269 BE221682 Hs.178364 ESTs 38X6

439602 W79114 HsX8558 ESTs 36X9

433686 AA604799 Hs.136528 ESTs, Moderately similar to ALU1_HUMAN A 36.29

417993 AW963705 Hs295806 ESTs, Weakly similar to ALU7_HUMAN ALU S 36.18

428214 AA936282 Hs.120397 ESTs 36.10

416908 AA333990 Hs.80424 coagulation factor XIII, A1 polypeptide 36.08

426264 BE314852 Hs.168694 hypothetical protein FLJ10257 36.00

415911 H08796 Hs.124952 ESTs 36.00

457502 AA076049 Hs274415 Homo sapiens cDNA FLJ10229 fis, clone HE 35.23

421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 35.20

401468 34X9

458561 AI220150 Hs211195 ESTs 34X0

433601 BE350738 Hs.123993 ESTs, Weakly similar to T00366 hypotheti 33.24

454977 AW848032 gb:IL3-CT0214-231299-053-D11 CT0214 Homo 32.96

402828 32.93

414522 AW518944 Hs.76325 Homo sapiens cDNA: FLJ23125 fis, clone L 31J6

402842 31X8

421245 AA285383 gb:HTH280 HTCDL1 Homo sapiens cDNA 5'/3' 31X9

401631 F05183 Hs.1799 CD1D antigen, d polypeptide 31.26

408057 AW139565 gb:UI-H-BI1-aea-d-04-0-UI.s1 NCI_CGAP_Su 31.24

408069 H81795 gb:ys68a10.r1 Soares retina N2b4HR Homo 31.20

438694 T87479 Hs291797 ESTs 31.09

449156 AF103907 Hs.171353 prostate cancer antigen 3 29.78

428796 AU076734 Hs.193665 solute carrier family 28 (sodium-coupled 29.76

452549 AI907039 gb:PM-BT134-020499-566 BT134 Homo sapien 29X9

410129 BE244074 Hs285531 regulator of Fas-induced apoptosis 29X3

414464 AI870175 Hs.13957 ESTs 29.47

412326 R07566 Hs.73817 small inducible cytokine A3 (homologous 29.22

459081 W07808 gb:zb03a12.r1 Soares_fetal_lung_NbHL19W 29.20

448702 AW102670 Hs.122464 ESTs 29.13

451939 U80456 Hs27311 single-minded (Drosophila) homolog 2 28.74

443412 W84893 Hs.9305 angiotensin receptor-like 1 28.61

457324 AB028990 Hs243901 KIAA1067 protein 28.24

424247 X14008 Hs234734 lysozyme (renal amyloidosis) 28.18

457140 AI279960 Hs.178140 ESTs 28.12

444151 AW972917 Hs.128749 alpha-methylacyl-CoA racemase 28.06

457669 AW104257 Hs.123426 ESTs, Weakly similar to putative serine/ 27X1

412429 AV650262 Hs.75765 GRO2 oncogene 27.36

405495 27.33

406516 27.25

407997 AW135429 Hs243577 ESTs 26.98

442115 AW452332 Hs257554 ESTs 26X6

409038 T97490 HsX0002 small inducible cytokine subfamily A (Cy 26X4

402838 26X2

449846 AI979284 Hs200552 ESTs 26.21

417153 X57010 HsX1343 collagen, type II, alpha 1 (primary oste 26.20

439792 NM_014856 HsX684 KIAA0476 gene product 25X1

450096 AI682088 Hs223368 ESTs 25X0

424186 AL133660 Hs.142926 Homo sapiens mRNA; cDNA DKFZp434M0927 (f 25X7

414246 BE391090 Hs280278 EST 25X7

420848 NM_005188 Hs.99980 Cas-Br-M (murine) ecotropic retroviral t 25.48

424778 AA251048 Hs.153042 lymphocyte antigen 9 25.42

409126 AA063426 gb:zf70c08.s1 Soares_pineal_gland_N3HPG 25.25

443936 AW083491 HsX1196 ESTs 25.22

419392 W28573 gbSIMO Human retina cDNA randomly prim 25X1

411201 T74588 HsX509 ESTs, Weakly similar to CO3_HUMAN COMPLE 24X5

422940 BE077458 gb:RC1-BT06()6^90500<)15-b04BT0606Homo 24.76

437571 AA760894 Hs.153023 ESTs 24.74

433973 AI014723 Hs.131770 ESTs 24X7

422416 BE019557 Hs.11900 Human DNA sequence from clone RP4-583P15 24X3

421552 AF026692 Hs.105700 secreted frizzled-related protein 4 24.49
443668 U25758 
424800 AL035588 
453S33 AA357001 
430565 AL122081 
433694 AI208611 
451045 AA215672 
408583 AW448674 
444040 AF204231 
414182 AA136301 
418678 NM_001327
408380 AF123050 
456076 BE243877 
418299 AA279530 
444917 R68651 
444381 BE387335 
415788 AW628686 
410896 AW809637 
412978 AI431708 
458418 AV653846 
454791 BE071874 
408748 J05500 
416011 H14487 
440474 AI207936 
447047 A16236S8 
426793 X89887 
409841 AW502139 
405685 

457359 AI983207 
423067 AA321355 
422355 AW403724
401201 

458278 W28912 
439097 H66948 
414875 H42679 
400926 

451355 NM_004197
446982 AW500221 
417105 X60992 
405777 

424123 AW966158 
425009 X58288 
443271 BE558568 
421064 AI245432 
418819 AA228776 
457595 AA584854 
404426 

412571 U43143 
431457 NM_012211
414002 NM_006732
418994 AA296520 
437158 AW090198 
437886 AA156781 
417421 AL138201 
433057 X15675 
421730 AW449808 
456557 AA284477 
440806 AI247422 
439845 AL355743 
416155 AI807264 
437820 AA769062 
450923 AW043951 
418329 AW247430 
424537 AI673027 
447742 AF113925
415251 R42863 
440770 AA912815 
407711 AI035846 
427157 U51166 
409847 AW501751 



Hs.134584 

Hs.153203

Hs.34045 

Hs244343 

Hs.12066 

Hs.47359 
Hs.182982 

Hs.167379 

Hs.44532 

Hs.76941 

Hs.83968 

Hs.144997 

HS283713 

Hs.78851 

Hs.820 
Hs.126261

Hs.47431 

Hs.7195 

Hs.246306 

Hs.172350 



ESTs

MyoD family inhibitor
hypothetical protein FLJ20764
cadherin related 23

Homo sapiens cDNA FLJ11720 fis, clone HE
gb:zr96e09.s1 NCI_CGAP_GCB1 Homo sapiens
ESTs



gb:zk93g04.s1 Soares_pregnant_uterus_NbH



diubiquitin

ATPase, Na+/K+ transporting, beta 3 poly
integrin, beta 2 (antigen CD18 (p95), ly
ESTs
ESTs

KIAA0217 protein

gb:MR4-ST0124-281099-015-b07 ST0124 Homo
homeo box C6

Homo sapiens Chromosome 16 BAC clone CIT
gb:RC2-BT0522-12020O414-a06 BT0522 Homo
spectrin, beta, erythrocytic (includes s
gb:ym18c10.r1 Soares infant brain 1NIB H
gamma-aminobutyric acid (GABA) A recepto
Homo sapiens cDNA: FLJ23529 fis, clone L
HIR (histone cell cycle regulation defec
gb:UI-HF-BR0p^r-e-O5<HJI.r1 NIHJ/IGC_5



Hs.192481 ESTs, Weakly similar to SYPH_HUMAN SYNAP
Hs285401 ESTs

Hs.140 immunoglobulin heavy constant gamma 3 (G

Hs.129019 ESTs

gb:yr86d10.r1 Soares fetal liver spleen
Hs.77522 major histocompatibility complex, class

Hs.444 serine/threonine kinase 19

Hs.43616 Homo sapiens mRNA for FLJ00029 protein,

Hs.81226 CD6 antigen

Hs58582 Homo sapiens cDNA FLJ12702 fis, clone NT
Hs.154151 protein tyrosine phosphatase, receptor t
Hs.195704 ESTs

Hs.101382 tumor necrosis factor, alpha-induced pro
Hs.191721 ESTs

gb:no09h11.s1 NCI_CGAP_Phe1 Homo sapiens

Hs.74049 fms-related tyrosine kinase 4

Hs256297 integrin, alpha 11

Hs.75678 FBJ murine osteosarcoma viral oncogene h

Hs.89548 selectin E (endothelial adhesion molecul

Hs.4779 KIAA1150 protein

Hs.83992 ESTs

Hs.82120 nuclear receptor subfamily 4, group A, m
Hs298832 Human pTR7 mRNA for repetitive sequence
Hs.164036 glucosamine (N-acetyl)-6-sulfatase (Sanf
Hs.96618 ESTs
Hs.129966 ESTs

Hs56663 Homo sapiens EST from clone 41214, full
Hs205442 ESTs, Weakly similar to AF117610 1 inner
Hs.16029 ESTs, Weakly similar to alternatively sp
Hs.38449 ESTs

Hs.84152 cystathionine-beta-synthase
Hs.143271 ESTs

Hs.19405 caspase recruitment domain 4
Hs.7124 ESTs
Hs222078 ESTs
Hs25522 ESTs

Hs.173824 thymine-DNA glycosylase
Hs279733 ESTs
24.49

24.10

24.04

24.00

23.69

23.83

23.73

23.62

23.39

23.20

22.68

22.55

22.38

22.26

22.08

22.04

22.00

21.95

21.94

21.34

21.26

21.24

21.14

21.11

20.30

18.52

18.47

18.40

18.28
417240
435732 



N57568 

AF229178 

AW977385 



432485 
429490 
429984 
449214 
433857 
431735 
401515 
444045 
442754 
426559 
432415 
427829 
432516 
435259 



444880 
417651 
453457 
424246 
419078 
417696 
431117 
455254 
425782 
426678 
426403 
425905 
438867 
420940 
459234 
404756 
422247 
420568 
443559 
438703 
411424 



422538 
447108 
448520 
438567 
407811 
410721 
437133 
408182 
417315 
431840 



418277 
410688 
420120 
429597 
447033 
421684 
408599 
446012 
409671 
405934 
426108 
418208 
410708 
447342 
454563 
411507 
438170 
416292 



AI971131 

AL050102 

AI889114 

AK000596 

AW977724 

AI097439 

AL045825 

AB001914 

T16971 

AI188225 

R08003 

M152106 

T81668 

AW118683

R06874 

AL037103 

AW452533 

M93119 

BE241624 

AF003522 

AW877015 

U66468 

H08170 

NM_000361

AB032959 

AW451157 

AA830664 

AI940425

U18244 

F09247 

AI076765 

AI803373 

AW845985 

NM_006441
AW449602 
AB002367 
AW451955 
AW190802
R23534 
AB018319 
AA047854 
AI080042
AA534908 
AA847856 
AW135221 
AW796342 
AL049610 
NM_003816
AI357412
BE281591 
AA055800 
AV656098 
AA076769 

AA622037 
AW291168 
AA534370 
AI199268
AW807530 
AW850140 
AI916685 
AA179233 



Hs.176028 EST

Hs.123136 leucine rich repeat and death domain con
Hs378615 ESTs

Hs.276770 CDW52 antigen (CAMPATH-1 antigen)

Hs.293684 ESTs, Weakly similar to alternatively sp

Hs.227209 DKFZP586F1019 protein

Hs.195663 ESTs

Hs.3618 hippocalcin-like 1

Hs.75968 thymosin, beta 4, X chromosome



Hs.135548 ESTs
Hs.210197 ESTs

Hs.170414 paired basic amino acid cleaving system
Hs.289014 ESTs
Hs.127482 ESTs
Hs.188013 ESTs
Hs.4859 cyclin L ania-6a

gb:yd29c04.r1 Soares fetal liver spleen
Hs.154150 ESTs
Hs.268628 ESTs

Hs.270599 ESTs, Weakly similar to unnamed protein
Hs.143604 Kaiso

Hs.89584 insulinoma-associated 1

Hs.82401 CD69 antigen (p60, early T-cell activati

Hs.250500 delta (Drosophila)-like 1

gb:QV2-PT0010-250300-096-f12 PT0010 Homo
Hs.159525 cell growth regulatory with EF-hand doma
Hs.113755 ESTs
Hs.2030 thrombomodulin
Hs.161700 KIAA1133 protein
Hs.181157 ESTs
Hs.143974 ESTs

gb:CM0-CT0052-150799-024KiO4 CT0052 Homo

Hs.113602 solute carrier family 1 (high affinity a
Hs.167399 protocadherin alpha 5
Hs269899 ESTs
Hs.31599 ESTs

gb:RC2-CT0163-200999-002-H08 CT0163 Homo



Hs.118131 5,

Hs217953 ESTs, Moderately similar to NK-TUMORREC
Hs21355 doublecortin and CaM kinase-like 1
H&153065 ESTs

Hs.40098 cysteine knot superfamily 1, BMP antagon
Hs3730 heterogeneous nuclear ribonucleoprotein
Hs.5460 KIAA0776 protein

gb:zf49g04.r1 Soares retina N2b4HR Homo
Hs.180450 ribosomal protein S24
Hs2860 POU domain, class 5, transcription facto
Hs.124565 ESTs
Hs.130812 ESTs

gb:PM2-UM0027-230200-002-h02 UM0027 Homo
Hs.95243 transcription elongation factor A (SII)-
Hs.2442 a disintegrin and metalloproteinase doma
Hs.157601 EST - not in UniGene
Hs.106768 hypothetical protein FLJ10511
Hs222933 ESTs

Hs.172382 hypothetical protein FLJ20001

gb:7B02B10 Chromosome 7 Fetal Brain cDNA

Hs.166468 programmed cell death 5
Hs.41295 ESTs

Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K

Hs.19322 ESTs; Weakly similar to !!!! ALU SUBFAMI

gb:CMr>STOQ81-130999-054402 ST0081 Homo
gb:IL3-CT0219-261099-023-D11 CT0219 Homo

Hs.194601 ESTs

Hs.42390
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406638 M13861 gfcHuman T-cell receptor active beta-cha 1526 

446686 AW138043 Hs.156307 ESTs 1525 

434485 A1623511 Hs. 118567 ESTs 1524 

441188 AW292330 Hs255609 ESTs 1522 

5 444172 BE147740 Hs.104558 ESTs 1522 

409521 BE244854 Hs. 159578 Homo sapiens mRNA for FU00020 protein, 15.16 

420748 AA279956 Hs.88672 ESTs 15.14 

422583 AA410506 H s. 118576 H .sapiens mRNA for ribosomal protein L18 15.14 

424240 AB023185 Hs. 143535 ra^unVcalrrodutlrHJependent protein kin 15.12 

10 451118 AJ662098 Hs.60640 ESTs 15.12 

437495 BE177778 gbflC1-HT0598*31030(H)12-f07HT0598Homo 15.12 

445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 15D6 

418305 AW006783 Hs.6586 ESTs 15.03 

402812 15.02 

436851 AA732480 Hs.293581 ESTs 15.00

400991 15.00

415752 BE314524 Hs.78776 Human putative transmembrane protein (nm 14.96

429900 AA460421 Hs.30875 ESTs 14.90

403683 14.84

430315 NM_004293 Hs.239147 guanine deaminase 14.80

451952 AL120173 Hs.301663 ESTs 14.72

424687 J05070 Hs.151738 matrix metalloproteinase 9 (gelatinase B 14.69

447229 BE617135 gb:601441677F1 NIH_MGC_65 Homo sapiens c 14.67 

425818 AB021225 Hs.159581 matrix metalloproteinase 17 (membrane-in 14.65

448553 AI638449 Hs.173031 ESTs 14.63

431089 BE041395 Hs.283676 ESTs, Weakly similar to unknown protein 14.60

459145 AI903354 gb:RC-BT029-100199-117 BT029 Homo sapien 14.55

449650 AF055575 Hs.297647 ESTs, Moderately similar to calcium chan 14.54

400952 14.46

445885 AI734009 Hs.127699 EST cluster (not in UniGene) 14.44

407938 AA905097 Hs.85050 phospholamban 14.42

431676 AI685464 Hs.292638 ESTs 14.40

437210 AA311443 Hs.293563 Homo sapiens mRNA; cDNA DKFZp586E2317 (f 14.36

451900 AB023199 Hs.27207 KIAA0982 protein 1458

445800 AA126419 Hs.301632 ESTs 14.32

412368 AW945992 Hs.181125 immunoglobulin lambda locus 14.31

40S055 AW304Q28 Hs.300578 ESTs 1423 

408763 W57550 Hs.301526 Homo sapiens cDNA FLJ13181 fis, clone NT 14.22

446734 AL049278 Hs.16074 Homo sapiens mRNA; cDNA DKFZp564l153 (fr 14.22

413551 BE242639 Hs.75425 ubiquitin associated protein 14.22

421913 AI934365 Hs.109439 osteoglycin (osteoinductive factor, mime 14.22

452712 AW838616 gb«(^T0054-1402(XH)13*D01 LT0054 Homo 1422 

451468 AW503398 Hs.210047 ESTs 14.16

406038 Y14443 Hs.88219 zinc finger protein 200 14.14

424909 S78187 Hs.153752 cell division cycle 25B 14.07

434078 AW880709 Hs.283683 EST 14.07

415254 AI815831 Hs.184378 ESTs 14.05 

418196 AI745649 Hs26549 ESTs.WeaJdysimBartoTOOOeehypotheti 14.02 

410020 T86315 Hs.728 ribonuclease, RNase A family, 2 (liver, 13.98

411352 NM_002890 Hs.758 RAS p21 protein activator (GTPase activa 13.98

429848 AF145439 Hs.225946 chemokine (C-C motif) receptor 9 13.95

413729 BE159999 gb:QV1-HT0412-270300-123-d10 HT0412 Homo 13.90

400125 ' 1338 

420319 AW406289 Hs.96593 hypothetical protein 13.85 

448272 AI479094 Hs.170786 ESTs 13.80

422695 AA315158 gb:EST186956 HCC cell line (matastasis t 13.80

424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 13.78

458048 H30340 Hs.173705 Homo sapiens cDNA: FLJ22050 fis, clone H 13.78

408894 AI935400 Hs.217286 ESTs 13.76

454093 AW860158 gb:RC0-CT0379-290100-032-b04 CT0379 Homo 13.75

410869 X91662 Hs.66744 twist (Drosophila) homolog (acrocephalos 13.74 

457751 AI908236 gb:lL-BT166-180399-010 BT166 Homo sapien 13.72 

455131 AW857913 gb:RC0-CT0323-231199-031-b05 CT0323 Homo 13.69

408364 AW015238 Hs.128453 ESTs 13.67

425907 AA365752 Hs.155955 ESTs 13.62

402359 13.60 

401044 1333 

409877 AW502498 Hs.157150 ESTs, Weakly similar to zinc finger prot 1333

423690 AA329648 Hs23804 ESTs 13.49 
430685 AI690234 Hs.191666 ESTs, Weakly similar to reverse transcri 13.47

414052 AW578849 Hs.283552 ESTs, Weakly similar to unnamed protein 13.46

447858 AW080339 Hs.211911 ESTs 13.44

435716 AI573283 Hs.38458 ESTs 13.44

439120 H58389 gb:yt87c03.r1 Soares_pineal_gland_N3HPG 13.43

402768 13.40 

451591 AA886446 Hs.146278 ESTs 13.40 

405411 13.38 

426558 AW188574 Hs.24218 ESTs 13.34

453506 AA132818 Hs.110407 ESTs, Weakly similar to coded for by C. 13.33

416445 AL043004 Hs.300678 Human serine/threonine kinase mRNA, part 13.32

457084 AI074149 Hs.150905 ESTs, Weakly similar to chondroitin 4-su 13.32

403838 13.32

427337 Z46223 Hs.176663 Fc fragment of IgG, low affinity IIIb, r 13.30

434318 AW207552 Hs.116328 ESTs, Weakly similar to (U134E15.1 [H.sa 13.28

435193 N41359 Hs.218107 ESTs 13.28

414758 AW451101 Hs.159489 ESTs, Moderately similar to hexokinase I 13.27

420626 AF043722 Hs.99491 RAS guanyl releasing protein 2 (calcium 13.26

420052 AA418850 Hs.44410 ESTs 13.25

414020 NM_002884 Hs.75703 small inducible cytokine A4 (homologous 13.25

403851 13.24

422647 W07492 Hs.157101 ESTs 13.21

433598 AI762836 Hs.271433 ESTs, Moderately similar to ALU2_HUMAN A 13.21

409065 AB033113 Hs.30187 KIAA1237 protein 13.20

435063 R21965 Hs.37734 G protein-coupled receptor kinase-intera 13.19

439367 BE386844 Hs.248746 ESTs 13.17

451957 AI796320 Hs.10299 Homo sapiens cDNA FLJ13545 fis, clone PL 13.16

420569 AA278362 Hs.269062 Homo sapiens cDNA FLJ12334 fis, clone MA 13.14

447883 BE262802 Hs.4909 dickkopf (Xenopus laevis) homolog 3 13.07

426490 NM_001621 Hs.170087 aryl hydrocarbon receptor 13.06

414769 AA155859 Hs.79708 ESTs 13.05

451418 BE387790 Hs.26369 ESTs 13.04

443494 T89719 Hs.270404 Homo sapiens cDNA: FLJ22389 fis, clone H 13.03

425878 AW964806 Hs.38085 ESTs, Weakly similar to putative glycine 13.02

431912 AI660552 Hs.154903 ESTs, Weakly similar to A56154 Abl subst 13.00

407122 H20276 Hs.31742 ESTs 13.00

456491 AU37468 Hs.97277 Homo sapiens mRNA; cDNA DKFZp434H1322 (f 12.99 

448172 N75276 Hs.135904 ESTs 12.98 

452144 AA032197 Hs.102558 ESTs 12.96

419953 BE267154 Hs.125752 ESTs 12.98

416182 NM_004354 Hs.79069 cyclin G2 12.94

451154 AA015879 Hs.33536 ESTs 12.93

412257 AW903830 gb:CM4-NN1037-250400-155-h04 NN1037 Homo 12.93

449784 AW161319 Hs.12915 ESTs 12.92

432695 D63480 Hs.278634 KIAA0146 protein 1232

454105 NM_001259 Hs.38481 cyclin-dependent kinase 6 1252

439093 AA534163 Hs.5476 serine protease inhibitor, Kazal type, 5 12.90

416098 H41324 Hs.31581 ESTs, Moderately similar to ST1B_HUMAN S 12.88

424897 D63216 Hs.153684 taled-retated protein 12.88

414604 AU076649 Hs.76556 growth arrest and DNA-damage-inducible 3 1238

414664 AA587775 Hs.66295 Homo sapiens HSPC311 mRNA, partial cds 12.84

452560 BE077084 gbflC5-BT0603-2202(XM)13<X)7BT0603Homo 1234 

413869 NM_000878 Hs.75596 interleukin 2 receptor, beta 12.80

452359 BE167229 Hs.29206 Homo sapiens clone 24659 mRNA sequence 1230

435886 BE265839 Hs.12126 hepatocellular carcinoma-associated anti 12.78

445230 U97018 Hs.12451 echinoderm microtubule-associated protei 12.78

412226 W26786 gb:15d7 Human retina cDNA randomly prime 12.77

446619 AU076643 Hs.313 secreted phosphoprotein 1 (osteopontin, 12.76

447769 AW8737C4 Hs.48764 ESTs 12.76 

414478 AI306389 Hs.76240 adenylate kinase 1 12.76

425383 D83407 Hs.156007 Down syndrome critical region gene 1-lik 12.68

450704 H85157 Hs.40696 ESTs 1236

405856 12.66

412935 BE267045 Hs.75064 tubulin-specific chaperone c 12.65

402802 12.62

452588 AA889120 Hs.110637 Homeobox A10 12.62

419978 NM_001454 Hs.93974 forkhead box J1 12.62

403137 12.60 

430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 12.57
448076 AJ133123 Hs.20198 adenylate cyclase 9 12.56

450462 F07097 Hs.300828 Homo sapiens mRNA full length insert cDN 12.54

405236 12.52

409292 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 12.47

421540 AA767669 Hs.10242 ESTs 12.47

425840 AW978731 Hs.301824 ESTs 12.44

443181 AI039201 Hs.54548 ESTs 12.42 

452436 BE077546 Hs.31447 ESTs 12.42 

455183 AW984111 gb:RCO-HN0OO7-16030(H)11-(09HN0007Homo 12.40 

432887 AI926047 Hs.162859 ESTs 12.37

410494 M36564 Hs.64016 protein S (alpha) 12.36

439024 R96696 Hs.35598 ESTs 12.36

451246 AW189232 Hs.39140 cutaneous T-cell lymphoma tumor antigen 12.36

432892 AL042615 Hs.15995 ESTs 12.35

418982 AI348838 Hs.13073 ESTs 12.35

414516 AI307802 Hs.279551 ESTs 12.34

440134 BE410734 gb:601301619F1 NIH_MGC_21 Homo sapiens c 12.29

443873 AL048542 Hs.16291 ESTs 12.28

401286 12.26

454020 AW962845 Hs.256527 ESTs 12.24

420077 AW512260 Hs.37767 ESTs 12.24

443837 AI984625 Hs.9884 spindle pole body protein 12.24

407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 12.23

435839 AF249744 Hs.25951 Rho guanine nucleotide exchange factor ( 12.22

448552 AW973653 Hs.20104 hypothetical protein FLJ00052 12.20

405325 12.20

451009 AA013140 Hs.115707 ESTs 12.18

423066 Y18264 Hs.120171 ESTs 12.17

439556 AI623752 Hs.163603 ESTs 12.16

443062 N77999 Hs.8963 Homo sapiens mRNA full length insert cDN 12.15

445873 AA250970 Hs.251946 Homo sapiens cDNA: FLJ23107 fis, clone L 12.14

453542 AW836724 Hs.33190 Homo sapiens mRNA expressed only in plac 12.11

440106 AA864968 Hs.127699 ESTs 12.10

417605 AF006609 Hs.82294 regulator of G-protein signalling 3 12.10

35 440268 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04 

420061 AW024937 Hs29410 ESTs 12.02 

458727 AIG22813 Hs.92679 Homo sapiens clone CDABP0014 mRNA sequen 11.96 

445407 AI222658 Hs.221869 ESTs, Weakly similar to la costa [D.mela 11.95

418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 11.94

414129 AI990287 Hs.270798 ESTs 11.93

409799 D11928 Hs.76845 phosphoserine phosphatase-like 11.92

438461 AW075485 Hs.286049 phosphoserine aminotransferase 11.92

443912 R37257 Hs.184780 ESTs 11.92

424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 11.90

434217 AW014795 Hs.23349 ESTs 11.90

451533 NM_004657 Hs.26530 serum deprivation response (phosphatidyl 11.90

422423 AF283777 Hs.116481 CD72 antigen 11.89

409398 AW386461 fib :PM4-PT0019-121299W-F02 PT0019 Homo 1 1 39 

423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 1152

446180 AI074413 Hs.14220 hypothetical protein FLJ20450 11.80

414341 D80004 Hs.75909 KIAA0182 protein 11.80

406538 11.79 

433253 AW450502 Hs.24218 ESTs 11.79

447397 BE247676 Hs.18442 E-1 enzyme 11.78

451684 AF216751 Hs.26813 CDA14 11.76

416882 R23765 Hs.23575 ESTs 11.74

425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72

428826 AL048842 Hs.194019 attractin 11.72

433037 NM_014158 Hs.279938 HSPC067 protein 11.72

447476 BE293466 Hs.20880 ESTs 11.72

452092 BE245374 Hs.27842 hypothetical protein FLJ11210 11.72

412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 11.72

401680 NM_005578 Hs.180398 LIM domain-containing preferred transloc 11.69

422576 BE548555 Hs.118554 CGI-83 protein 11.68

450203 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra 11.68

410531 AW752953 gb:QV0-CT0224-261099-035-g02 CT0224 Homo 11.67

425917 W28517 Hs.117167 Homo sapiens cDNA: FLJ23087 fis, clone L 11.66

418693 AI750878 Hs.37409 thrombospondin 1 11.64

400557 11.62
416188 BE157260 Hs.79070 v-myc avian myelocytomatosis viral oncog 11.60

418047 AW952771 Hs.90043 ESTs 11.59 

420441 AI986160 Hs.88446 ESTs 11.59

400885 11.57

5 409853 AW502327 Qb:UmF-BR0p^ka-a-07-(HJI.r1 NIHJJIGCJ 1156 

400802 11.56

434540 NM_016045 Hs.5184 TH1 drosophila homolog 11.55

431449 M55994 Hs.256278 tumor necrosis factor receptor superfami 11.55

425928 S55736 Hs.238852 ESTs, Weakly similar to hypothetical pro 11.54

434701 M460479 Hs.4096 KIAA0742 protein 11.53

434228 Z42047 Hs.283978 ESTs; KIAA0738 gene product 11.52

420729 AW964897 Hs.290825 ESTs 11.52

428328 AA426080 Hs.98489 ESTs 11.50

433887 AW204232 Hs.279522 ESTs 11.50

414812 X72755 Hs.77367 monokine induced by gamma interferon 11.46

457718 F1B572 HsWB ESTs 11.44 

452260 AA453208 Hs.28726 RAB9, member RAS oncogene family 11.42

459029 AA131376 Hs.285203 fibroblast growth factor 12 11.42

456267 AI127958 Hs.83393 cystatin E/M 11.39

433285 AW975944 Hs.237396 ESTs 11.38

449188 AW291876 Hs.196986 ESTs 11.37

447861 AI434593 Hs.164294 ESTs 11.37

456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 11.36

439444 AI277652 Hs.54578 ESTs 11.31

401163 11.31

430886 L36149 Hs.243116 chemokine (C motif) XC receptor 1 11.28

450784 AW246803 Hs.47289 ESTs 11.28

452391 AL044829 Hs.29331 carnitine palmitoyltransferase I, muscle 11.27

449625 NMJH4253 Hs23796 odz (odd Oz/ten-m, Drosophila) homolog 1 1156 

456827 AA075687 Hs.147176 epidermal growth factor receptor substra 11.24

439328 W07411 Hs.118212 ESTs, Moderately similar to AW3_HUMAN A 11.24

432093 H28383 gb:y!52c03.r1 Soares breast 3NbHBst Homo 11.24

407335 AA631047 Hs.158761 Homo sapiens cDNA FLJ13054 fis, clone NT 11.23

442501 AA315267 Hs.23128 ESTs 11.22

429746 AJ237672 Hs.214142 5,10-methylenetetrahydrofolate reductase 11.21

422858 R35398 gb:yg64g10.r1 Soares infant brain 1NIB H 11.20

415156 X84908 Hs.76060 phosphorylase kinase, beta 11.20

446713 AV660122 Hs.282675 ESTs 11.20

452221 C21322 Hs.11577 ESTs 11.20

418261 W78902 Hs.293297 ESTs 11.17

433332 AI367347 Hs.127809 ESTs 11.16 

434539 AW748078 Hs214410 ESTs 11.16 

413471 BE142098 gb:CM4-HT0137-220999-017-d11 HT0137Homo 11.14 

410037 AB020725 HS58009 KIAA0918 protein 11.14 

405601 11.13

458332 AI000341 Hs.220491 ESTs 11.12

427654 AA410183 Hs.137475 ESTs 11.12

427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10

431475 AI567669 Hs.287316 ESTs 11.10

425710 AF030880 Hs.159275 solute carrier family, member 4 11.08

413748 AW104057 Hs.19193 ESTs 11.07

409208 Y00093 Hs.51077 integrin, alpha X (antigen CD11C(p150), 11.07

457278 W92745 Hs.193324 ESTs 11.03

407021 U52077 gb:Human marinerl transposase gene, comp 11.02 

55 445701 AF055581 Hs.13131 lymphocyte adaptor protein 11.02 

408338 AW867079 gb^R1-SN0033-12040(H)02-c10SN0033Homo 1055 

401030 BE382701 Hs.25960 v-myc avian myelocytomatosis viral relat 10.95

437891 AW006969 Hs.6311 hypothetical protein FLJ20859 10.94

453874 AW591783 Hs.36131 collagen, type XIV, alpha 1 (undulin) 10.94

421562 AA530994 Hs.105803 ghrelin precursor 10.92

413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 10.92

400132 10.92

436420 AA443966 Hs.51595 ESTs 10.90

424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 10.88

433264 D85782 Hs.3229 cysteine dioxygenase, type I 10.88

429842 AI366213 Hs.173422 KIAA1605 protein 10.87

412405 AW948128 gb:RC0-MT0013-280300-031-a12 MT0013 Homo 10.85

400615 10.80

425018 BE245277 Hs.154196 E4F transcription factor 1 10.80
456011 BE243628 gb:TCBAP1D1053 Pediatric pre-B cell acut 10.79

455982 BE176882 gb:RC4-HT0587-170300-012-a04 HT0587 Homo 10.74

450418 BE21S418 Hs501802 ESTs 10.73 

412490 AW803564 Hs588850 ESTs 10.72 

5 438962 AW377314 Hs5364 DKFZP564I052 protein 10.70 

437743 AI383497 Hs.131811 ESTs, Weakty similar to ALU1_HUMAN ALU S 1070 

449987 R40978 Hs571498 ESTs, Moderately similar to ALU LHUMAN A 1070 

449590 AA694O70 Hs568835 ESTs 10.68 

446035 NM_006558 Hs.13565 Sam68-like phosphotyrosine protein, T-ST 10.68

426530 U24576 Hs.170250 complement component 4A 10.66

428600 AW663261 Hs.15036 ESTs, Highly similar to AF161358_1 HSPC0 10.64

420090 AA220238 Hs.94986 ribonuclease P (38kD) 10.64

451593 AF151879 Hs.26706 CGI-121 protein 10.62 

438893 AF075031 Hs59327 ESTs 1032 

15 459324 AW080953 gb*Jtc28c12.x1 NCLCGAP_Co1 8 Homo sapiens 10.61 

439883 AL359652 Hs.171096 Homo sapiens EST from clone DKFZp434A041 10.58

406513 AA715328 HS5912Q5 ESTs 10.57

407826 AA128423 Hs.40300 calpain 3, (p94) 10.57

419550 D50918 Hs.90998 KIAA0128 protein; septin 2 10.56

428522 R10184 Hs.191987 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.56

459526 AI142350 Hs.146735 EST 10.55

411448 AA178955 Hs571439 ESTs 10.54

410102 AW248508 HS579727 ESTs; 10.52

406577 10.52

408405 AK001332 Hs.44672 hypothetical protein FLJ10470 10.51

428966 AF059214 Hs.194687 cholesterol 25-hydroxylase 10.50

400880 10.48 

415875 AA894876 Hs.5687 protein phosphatase 1B (formerly 2C), ma 10.48

434715 BE005346 Hs.116410 ESTs 10.46

406851 AA609784 Hs.180255 major histocompatibility complex, class 10.44

413409 AI638418 Hs.21745 ESTs 10.44

418489 U76421 Hs.85302 adenosine deaminase, RNA-specific, B1 (h 10.44

419465 AW500239 Hs.21187 Homo sapiens cDNA: FLJ23068 fis, clone L 10.44

419544 AI909154 gb:QV-BT200-010499-007 BT200 Homo sapien 10.44

432180 Y18418 Hs.272822 RuvB (E coli homolog)-like 1 10.44

413822 R08950 Hs572044 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.42

437446 AA788946 Hs.16869 ESTs, Moderately similar to CA1C RAT COL 10.41

415701 NM_003878 Hs.78619 gamma-glutamyl hydrolase (conjugase, fol 10.41

443790 NM_003500 Hs.9795 acyl-Coenzyme A oxidase 2, branched chai 10.40

40 458873 AW150717 Hs596176 STAT Induced STAT Inhibitor 3 1058 

415082 AA160000 Hs.137396 ESTs 1057 

429124 AW505086 Hs.196914 minor hlstocompafibliity antigen HA-1 1056 

417187 AB011151 Hs.81505 KIAA0579 protein 1054 

426827 AW067805 Hs.172665 methylenetetrahydrofolate dehydrogenase 1054 

45 424280 NMJXJ0Q30 Hs571366 eJantne-fllyoxylate ejninotransferase homo 1053 

446099 T93098 Hs.17126 ESTs 10.32 

423445 NM.014324 Hs.128749 atpha-methylacyl-CoA racemase 1051 

409995 AW960597 Hs.30164 ESTs 10.30 

432242 AW022715 Hs.162160 ESTs, Weakly similar to ALU4.HUMAN ALU S 1050 

50 406394 AA172106 Hs.1 10950 Rag C protein 1050 

406189 10.29 

422283 AW411307 Hs.114311 CDC45 (cell division cycle 45, S.cerevis 1056 

401598 AA172106 Hs.1 10950 Rag C protein * 1056 

456995 T89832 Hs.170278 ESTs 1026 

416511 NM_006762 Hs.79356 Lysosomal-associated multispanning membr 1054

427274 NM_005211 Hs.174142 colony stimulating factor 1 receptor, fo 1054

401384 1053 

456226 D 13168 Hs.82002 endothelin receptor type B 1052 

426928 AF037062 Hs.172914 retinol dehydrogenase 5 (11-cis and 9-cis 1051

60 423032 AI6S4746 Hs,1 19274 ESTs 1050 

436556 AI364997 Hs.7572 ESTs 1050 

418400 BE243026 Hs.301989 KIAA0248 protein 10.19

437401 AA757196 Hs.121190 ESTs 10.19 

403690 10.17 

65 423790 BE152393 gb^M2-HT0323-171199-033-a08HT0323Homo 10.16 

434094 AA305599 Hs538205 hypothetical protein PRO2013 10.16 

434967 AW975009 Hs592274 ESTs 10.16 

432827 Z68128 Hs.3109 Rho GTPase activating protein 4 10.16 

432660 AI288430 Hs.64004 ESTs 10.14 
452234 AW084176 Hs.223296 ESTs 10.14

445629 AI245701 gb:qk31K>5.x1 NCI_CGAPJ<id3 Homo sapiens 10.13 

457238 AA826142 Hs.179991 ESTs, Weakly similar to KPCE_HUMAN PROTE 10.13 

444605 AI174603 Hs.254105 enolase 1, (alpha) 10.12

450313 AI038989 Hs.24809 hypothetical protein FLJ10826 10.12

407482 NM_006056 10.12

449971 AA807346 Hs.288581 Homo sapiens cDNA FLJ14296 fis, clone PL 10.11

441201 AW118822 Hs.128757 ESTs 10.10 

435157 AW014605 Hs.179872 ESTs 10.10 

417308 H60720 Hs.81892 KIAA0101 gene product 10.09

442582 AI204266 Hs.179303 ESTs 10.05

437252 AI433833 Hs.164159 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.04

448663 BE614599 Hs.106823 H.sapiens gene from PAC 42616, similar t 10.04

434467 BE552368 Hs.231853 Homo sapiens cDNA FLJ13445 fis, clone PL 10.04

423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02

412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 10.00

414658 X58528 Hs.76781 ATP-binding cassette, sub-family D (ALD) 10.00 

421832 NM_016098 Hs. 108725 HSPC040 protein 10.00 

423554 M90516 Hs.1674 glutamine-fructose-6-phosphate transamin 10.00

452039 AI922988 Hs.172510 ESTs 10.00

434673 AW137442 Hs.136965 ESTs 10.00 

427976 AA418280 Hs.180040 Homo sapiens cDNA: FLJ22439 fis, clone H 10.00

457803 BE501815 Hs.198011 ESTs 9.99

428279 AA425310 Hs.155766 ESTs 9.98

444412 AI147652 Hs.216381 Homo sapiens clone HH409 unknown mRNA 9.98

417049 N72394 Hs.44862 ESTs 9.96

427509 M62505 Hs.2161 complement component 5 receptor 1 (C5a l 9.96

445424 AB028945 Hs.12696 cortactin SH3 domain-binding protein 9.96

443678 AW009805 Hs.231923 ESTs 9.96 

447567 AW474513 Hs.224397 ESTs, Weakly similar to B48013 proline-r 9.94

414709 AA704703 Hs.77031 Sp2 transcription factor 9.94

434596 T59538 gb:yb65g12.s1 Stratagene ovary (937217) 9.94

427630 BE276115 Hs.144980 ESTs, Weakly similar to CA13_HUMAN COLLA 9.93

416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 9.92

423349 AF010258 Hs.127428 homeobox A9 9.92

424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. 9.92

416814 AW192307 Hs.80042 dolichyl-P-Glc:Man9GlcNAc2-PP-dolichylgl 9.90

417986 AA481003 Hs.97128 ESTs 9.90

425174 D87450 Hs.154978 KIAA0261 protein 9.90

438171 AW976507 Hs.293515 ESTs 9.90

421984 AW972187 Hs.110443 hypothetical protein FLJ22215 9.89

408597 NM_005291 Hs.46453 G protein-coupled receptor 17 9.88 

413907 AI097570 Hs.71222 ESTs 9.87 

451298 AW801383 Hs.118578 H.sapiens mRNA for ribosomal protein L18 9.86

433409 AI278802 Hs.25661 ESTs 9.85

450360 AW117416 Hs.245484 ESTs 9.85

433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84

449824 AI962552 Hs.226765 ESTs 9.84

452744 AI267652 Hs.30504 Homo sapiens mRNA; cDNA DKFZp434E032 (fr 9.82

431066 AF026273 Hs.249175 interleukin-1 receptor-associated kinase 9.82

426457 AW894667 Hs.169965 chimerin (chimaerin) 1 9.80

443371 AI792888 Hs.145489 ESTs 9.80

437159 AL050072 gb:Homo sapiens mRNA; cDNA DKFZp566E1346 9.75

425242 D13635 Hs.155287 KIAA0010 gene product 9.74 

55 447498 N67619 Hs.43687 ESTs 9.74 

426759 AI590401 Hs.21213 ESTs 9.73 

435129 AI381659 Hs.267086 ESTs 9.72 

437672 AW748265 Hs.5741 flavohemoprotein b5+b5R 9.72

438209 AL120659 Hs.6111 KIAA0307 gene product 9.72 

60 438440 AA807228 Hs.225161 ESTs 9.72 

449720 AA311152 Hs.288708 ESTs; Weakly similar to KIAA0226 [H.sapi 9.72

414291 AI289619 Hs.13040 ESTs 9.72

438206 AK001451 Hs.265561 CD2-associated protein 9.70

446896 T15767 Hs.22452 Homo sapiens cDNA: FLJ21084 fis, clone C 9.70

412667 AW977540 Hs.269254 ESTs 9.70

423301 S67580 Hs.1645 cytochrome P450, subfamily IVA, polypept 9.67 

440757 AW118645 Hs.160004 ESTs 9.67 

441412 AI393657 Hs.159750 ESTs 9.66 

421044 AF061871 Hs.101302 collagen, type XII, alpha 1 9.66
414726 BE466863 Hs.280099 ESTs 9.66

418485 R91679 Hs.124981 ESTs 9.66

433480 X02422 Hs.181125 immunoglobulin lambda locus 9.65

441530 AI248301 Hs.127112 ESTs 9.65 

5 433533 053304 Hs.85394 ESTs 9.65 

421470 R27496 Hs.1378 annexin A3 9.64

438613 C05569 Hs.243122 hypothetical protein FLJ13057 similar to 9.64

429324 AA488101 Hs.199245 inactivation escape 1 9.62

450244 AA007534 Hs.125062 ESTs 9.62

10 407660 AW063190 Hs.279101 ESTs 9.61 

406554 9.60 

426404 AA377607 Hs.273138 ESTs 9.58 

447045 AW392394 Hs.278569 KIAA0064 gene product 9.58

449894 AK001578 Hs.24129 hypothetical protein FLJ10716 9.58

448376 AI494332 Hs.196963 ESTs 9.58

407902 AL117474 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 9.56

446572 AV659151 Hs.282961 ESTs 9.56

459245 BE242623 Hs.31939 manic fringe (Drosophila) homolog 9.55

423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 9.54

414697 BE266134 Hs.76927 translocase of outer mitochondrial membr 9.54

410846 AW807057 gb:MR4-ST0062-031199^)18-b03ST0062Homo 952 

421181 NM_005574 Hs.184585 LIM domain only 2 (rhombotin-like 1) 9.52

427308 D26067 Hs.174905 KIAA0033 protein 9.52

415995 NM_004573 Hs.994 phospholipase C, beta 2 9.51

434846 AW295389 Hs.119768 ESTs 9.51

414342 AA742181 Hs.75912 Homo sapiens cDNA: FLJ22199 fis, clone H 9.50

416959 D28459 Hs.60612 ubiquitin-conjugating enzyme E2A (RAD6 h 9.50

443123 M094538 Hs.6588 ESTs 9.50 

439312 AA833902 Hs.270745 ESTs 9.48 

30 449375 R07114 Hs.271224 ESTs 9.48 

436357 AJ132085 gb:Homo sapiens mRNA for axonemal dynein 9.44

458723 AW137726 Hs.244352 ESTs, Moderately similar to laminin alph 9.44

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H.sapiens] 9.43

404741 9.43

422409 NM_005428 Hs.116237 vav 1 oncogene 9.43

403708 9.42 

408806 AW847814 Hs.289005 Homo sapiens cDNA: FU21532 fis, clone C 9.42 

417380 T06809 gb:EST04698 Fetal brain, Stratagene (cat 9.42 

422501 AA354690 Hs. 144957 ESTs 9.42 

426197 AA004410 Hs.167835 acyl-Coenzyme A oxidase 1, palmitoyl 9.42

452624 AU076606 Hs.30054 coagulation factor V (proaccelerin, labi 9.42

412110 AW893569 gb:RC0-NN0021-040400-021-c10 NN0021 Homo 9.41

414158 AA361623 Hs.288775 Homo sapiens cDNA FLJ13900 fis, clone TH 9.41

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 9.40

414171 AA360328 Hs.865 RAP1A, member of RAS oncogene family 9.40

415947 U04045 Hs.78934 mutS (E. coli) homolog 2 (colon cancer, 9.40

426959 BE262745 gb:601 153869F1 NIH_MGCJ9 Homo sapiens c 959 

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 9.39 

457181 BE514362 Hs.296422 FK506-binding protein 3 (25kD) 959 

50 402835 958 

404632 9.38 

446566 H95741 Hs.17914 Homo sapiens cDNA: FU22801 fis, clone K 957 
455369 AW903533 gb:CM1-NN1031-060400-178-d05NN1 031 Homo * 9.37 

444001 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 9,38 

55 458191 AI420611 Hs.127832 ESTs 9.36 

431374 BE258532 Hs.251871 CTP synthase 9.34

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 9.33

407061 X97748 gb:H.sapiens PTX3 gene promoter region. 9.33

416967 BE616731 Hs.80645 interferon regulatory factor 1 9.33

60 423013 AW875443 Hs.22209 secreted modular calcium-binding protein 9.33 

439461 AA693960 Hs.103158 ESTs 9.33 

416830 BE513731 Hs58959 Human DNA sequence from clone 967N21 on 9.32 

422763 AA033699 Hs.53938 ESTs, Moderately similar to MASP-2 [H.sa 9.32

442739 NM_007274 Hs.8679 cytosolic acyl coenzyme A thioester hydr 9.32

452859 AI300555 Hs.288158 Homo sapiens cDNA: FLJ23591 fis, clone L 9.32

403237 9.32

415000 AW025529 Hs.239812 ESTs, Weakly similar to CALM_HUMAN CALMO 9.31

417951 AW976410 Hs.289069 Homo sapiens cDNA: FLJ21016 fis, clone C 9.30

419066 Z98492 Hs.6975 PRO1073 protein 9.30
448443 AW167128 Hs.231934 ESTs 9.30

405125 9.30 

409768 AW499568 gb:UI-HF-BRQp<ii-h-03KHJI.r1 NIH_MGC„5 928 

453708 AI191811 Hs.54629 ESTs 9.28

442271 AF000652 Hs.8180 syndecan binding protein (syntenin) 9.27

410055 AJ250839 H&S8241 gene for serine/threonine protein kinase 9.26

448692 AW013907 Hs.224276 ESTs, Moderately similar to predicted us 9.26

417381 AF164142 Hs.82042 solute carrier family 23 (nucleobase tra 9.25

422497 D29642 Hs.1528 KIAA0053 gene product 9.25

414140 AA281279 Hs.23317 ESTs 9.24

435980 AF274571 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 9.24

458530 BE395035 Hs.199889 ESTs, Weakly similar to KIAA0874 protein 9.24

402585 9.24

420819 AA260700 gb:zs95h11.s1 NCI_CGAP_GCB1 Homo sapiens 9.23

444755 AA431791 Hs.183001 ESTs 9.22

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 9.22

421246 AW582962 Hs.300961 ESTs, Highly similar to AF151805_1 CGI-4 9.20

421924 BE514514 Hs.10S606 coronin, actin-binding protein, 1A 9.19 

414888 AL039185 Hs.77558 thyroid hormone receptor interactor 7 9.18 

20 434267 AI206589 Hs.1 16243 ESTs 9.17 

409213 U61412 Hs.51133 PTK6 protein tyrosine kinase 6 9.17 

428242 H55709 Hs.2250 leukemia inhibitory factor (cholinergic 9.16

451736 AW080356 Hs.293684 ESTs, Weakly similar to alternatively sp 9.15

413627 BE182082 Hs.246973 ESTs 9.14

416134 AA528402 Hs.74861 activated RNA polymerase II transcriptio 9.14

449251 AW151660 Hs.31444 ESTs 9.14 

452813 U54727 Hs.191445 ESTs 9.14 

443622 AI911527 Hs.11805 ESTs 9.14 

413260 BE075281 gb:PM1-BTO585-29()2OOK)0^iO7 BT0585 Homo 9.12 

413450 Z99716 Hs.75372 N-acetylgalactosaminidase, alpha- 9.12

446442 BE221533 Hs.257858 ESTs 9.12

438540 AA810021 Hs.136906 ESTs 9.12

426251 M24283 Hs.168383 intercellular adhesion molecule 1 (CD54) 9.11

410290 AA402307 Hs.73818 ubiquinol-cytochrome c reductase hinge p 9.10

437398 AA913736 Hs.126715 ESTs 9.10

421559 NM_014720 Hs.105751 Ste20-related serine/threonine kinase 9.10

439699 AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10

430799 C19035 Hs.164259 ESTs 9.09

424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-amino aci 9.08

40 453942 AW190920 Hs.19928 ESTs 9.08 

425844 T68073 Hs.159628 serine (or cysteine) proteinase inhibito 9.08 

434658 AI624436 Hs.194488 ESTs 9.07 

453999 BE328153 Hs240087 ESTs 9.06 

436490 R71543 Hs.18713 ESTs 9.05 

409192 AA065131 Hs.233439 ESTs, Weakly similar to ALU7_HUMAN ALU S 9.05

446223 BE300091 Hs.119699 hypothetical protein FLJ12969 9.04

447247 AW369351 Hs.287955 Homo sapiens cDNA FLJ13090 fis, clone m 9.04

450094 AI174947 Hs.295789 Homo sapiens mRNA; cDNA DKFZp564D1164 (f 9.04

432012 AW301344 Hs.195969 ESTs 9.04

422520 AU076730 Hs.117977 kinesin 2 (60-70kD) 9.02

418650 BE386750 Hs.86978 prolyl endopeptidase 9.02 

423008 M81590 Hs.123016 5-hydroxytryptamine (serotonin) receptor 9.02 

436476 AA326108 H8S3631 ESTs * 9.02 

448206 BE622585 Hs.3731 ESTs 9.02 

431574 AW572659 Hs.261373 adenosine A2b receptor pseudogene 9.01

443453 R99876 Hs.269882 ESTs 9.01

435472 AW972330 Hs.283022 triggering receptor expressed on myeloid 9.01

420337 AW295840 Hs.14555 Homo sapiens cDNA: FLJ21513 fis, clone C 9.00

449810 AB008681 Hs.23994 activin A receptor, type IIB 9.00

406780 AA902386 Hs.286 ribosomal protein L4 8.99

429169 AW341130 Hs.197757 ESTs, Moderately similar to FGFE_HUMAN F 8.99

421326 AF051428 Hs.103504 estrogen receptor 2 (ER beta) 8.97

425491 AA883316 Hs255221 ESTs 8.96 

425516 BE000707 Hs29567 ESTs 8.96 

65 439773 AI051313 Hs. 1433 15 ESTs 8.96 

443247 BE614387 Hs.47378 ESTs 8.96 

456623 AI084125 Hs. 108 106 transcription factor 8.95 

438707 108239 Hs.5326 porcupine 8.95 

402240 8.85 
444152 AI125694 Hs.149305 Homo sapiens cDNA FLJ14264 fis, clone PL 8.95

409842 AW501758 gb:UI-HF-BROp-a|frH>09-0-Ul.r1 NIH_MGC_5 8.94 

416277 W78765 Hs.73580 ESTs 8.94 

456697 AI908006 Hs.111334 ferritin, light polypeptide 8.94

410762 AF226053 Hs.66170 HSKM-B protein 8.92

412942 AL120344 Hs.75074 mitogen-activated protein kinase-activat 8.92

442320 AI287817 Hs.129636 ESTs 8.92 

449673 AA002064 Hs.18920 ESTs 8.91 

411486 N85785 Hs.181165 eukaryotic translation elongation factor 8.90 

437916 BE566249 Hs.20999 Homo sapiens cDNA: FLJ23142 fis, clone L 8.90

442732 AA257161 Hs.8658 hypothetical protein DKFZp434E0321 8.89

419741 NM_007019 Hs.93002 ubiquitin carrier protein E2-C 8.89

411499 AW849292 gb:IL3-CT0^15^2030(M)9^EO3CT0215Horno 8.89. 

431154 AW971228 Hs.290259 ESTs 8.89 

15 414922 D00723 Hs.77631 glycine cleavage system protein H (amino 8.88 

418036 Z37976 Hs.83337 latent transforming growth factor beta b 8.87 

406422 8.87 

422926 NM_016102 Hs.121748 ring finger protein 16 8.87

435220 D50030 Hs.104 HGF activator 8.86

418203 X54942 Hs.83758 CDC28 protein kinase 2 8.86

418613 AA744529 Hs.86575 mitogen-activated protein kinase kinase 8.85

439250 H66566 Hs.271711 ESTs 8.85 

432359 AA076049 Hs.274415 Homo sapiens cDNA FLJ10229 fis, clone HE 8.84

450000 AI952797 Hs.10888 Homo sapiens cDNA: FLJ21559 fis, clone C 8.83

425657 T89839 Hs.119471 ESTs 8.83

425694 U51333 Hs.159237 hexokinase 3 (white cell) 8.82

419972 AL041465 Hs.294038 ESTs, Moderately similar to ALU2_HUMAN A 8.82

436396 AI683487 Hs.299112 Homo sapiens cDNA FLJ11441 fis, clone HE 8.82

413413 D82520 Hs.301834 Homo sapiens cDNA FLJ10952 fis, clone PL 8.82

30 428807 AA435997 Hs.104930 ESTs 8.82 

415839 R40611 Hs.137565 ESTs 8.81 

419553 N34145 Hs.250614 ESTs 8.80 

420309 AW043637 Hs.21766 ESTs 8.80 

421863 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 8.80 

35 447965 AW292577 Hs.94445 ESTs 8.80 

459172 BE063380 gb:PM0-BT0275-291099^)02-g10BT0275Homo 8.80 

403259 8.78 

411534 AW850473 gb:IL3-CT0219-280100-061-B11 CT0219 Homo 8.78

456161 BE264645 Hs.282093 Homo sapiens cDNA: FLJ21918 fis, clone H 8.77

413654 AA331881 Hs.75454 peroxiredoxin 3 8.76

401744 8.76

425348 AL137477 Hs.155912 cadherin-like 24 8.76

423396 AI382555 Hs.127950 bromodomain-containing 1 8.75

450649 NM_001429 Hs.297722 Human DNA sequence from clone RP1-85F18 8.75

408331 NM_007240 Hs.44229 dual specificity phosphatase 12 8.74

423872 AB020316 Hs.134015 uronyl 2-sulfotransferase 8.74

424906 AI566086 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3' 8.74

427596 AA449506 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (f 8.73 

432488 AA551010 Hs.216640 ESTs 8.72 

50 448980 AL137527 Hs.22703 Homo sapiens mRNA; cDNA DKFZp434P1018 (f 8.72 

429455 AI472111 Hs.292507 ESTs 8.71 

429855 AW385597 Hs.138902 ESTs, Weakly similar to B34087 hypotheti 8.71 

441746 H59955 Hs.127829 ESTs " 8.70 

411945 AL033527 Hs.92137 v-myc avian myeiocytomatosis viral oncog 8.70 

55 413492 D87470 Hs.75400 KIAA0280 protein ' 8.70 

435706 W31254 Hs.7045 GL004 protein 8.70 

433741 AA609019 Hs.159343 ESTs 8.70 

426340 Z97989 Hs.169370 FYN oncogene related to SRC, FQR, YES 8.69 

422779 AA317036 Hs.41989 ESTs 8.67 

60 449785 AI225235 Hs£88300 Homo sapiens cDNA: FU23231 fis, clone C 8.67 

420144 AA811813 Hs.1 19421 ESTs 8.66 

420235 AA256756 Hs.31178 ESTs 8.66 

432606 NM.002104 Hs.3066 granzyme K (serine protease, granzyme 3; 8.66 

425762 BE244076 Hs. 159578 Homo sapiens mRNA for FLI00020 protein, 8.65 

65 427448 BE246449 Hs.2157 Wiskott-Aldrich syndrome (eczema-thrombo 8.64 

418033 W68180 H&259855 Homo sapiens cDNA FU12507 fis, clone NT 8.64 

429084 AJ001443 Hs.1 95614 splicing factor 3b, subunlt 3, 130kD 8.64 

417094 NM.O06895 Hs.81182 histamine N-methyttransferasa 8.64 

457277 NM.004736 Hs.227656 xenotroplc and polytroplc retrovirus rec 8.63 
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422631 BE218919 Hs.1 18783 hypothetical protein FU10688 8.63 

410879 AW785196 Hs21 5857 ring finger protein 14 8.63 

431585 BE242803 Hs.262823 hypothetical protein FLJ10326 8.62

401851 8.62

401866 8.62

407783 AW996872 Hs.172023 a disintegrin and metalloproteinase doma 8.62

408242 AA251594 Hs.43913 PIBF1 gene product 8.62

422250 AW408530 Hs.113823 ClpX (caseinolytic protease X, E.coli) 8.62

430259 BE550182 Hs.127826 RalGEF-like protein 3, mouse homolog 8.62

452598 AI831594 Hs.68847 ESTs, Weakly similar to ALU7_HUMAN ALU S 8.62

418541 AW749617 gb«C3-BT0502-1301(MMl12-g07BT0502Homo 8.60

428839 AI767756 Hs.82302 ESTs 8.60

429328 AA828402 Hs.47939 ESTs 8.60

451491 AI972094 Hs.286221 Homo sapiens cDNA FLJ13741 fis, clone PL 8.60

452561 AI692181 Hs.49169 KIAA1634 protein 8.60

420027 AF009746 Hs.94395 ATP-binding cassette, sub-family D (ALD) 8.60

435205 X54136 Hs.181125 immunoglobulin lambda locus 8.60

430900 U91939 Hs.248123 G protein-coupled receptor 25 8.60

405074 8.59

437991 AI479773 Hs.181679 ESTs 8.59

436346 BE328882 Hs.193096 ESTs, Moderately similar to U119.HUMAN U 8.58

411079 AA091228 gb:cchn2152.seq.F Human fetal heart, Lam 8.57

418452 BE379749 Hs.85201 C-type (calcium dependent, carbohydrate- 8.56

429109 AL008637 Hs.196352 neutrophil cytosolic factor 4 (40kD) 8.56

448019 AW947164 Hs.185641 ESTs 8.56

449865 AW204272 Hs.199371 ESTs 8.55

431180 H55883 gb:yq94h03.r1 Soares fetal liver spleen 8.54

445988 BE007663 Hs.13503 Inactivation escape 2 8.54

405876 8.54

407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 8.54

414807 AI738616 Hs.77348 hydroxyprostaglandin dehydrogenase 15-(N 8.54

425671 AF193612 Hs.159142 lunatic fringe (Drosophila) homolog 8.54

452413 AW082633 Hs.212715 ESTs 8.54

421620 AA446183 Hs.91885 ESTs 8.53

444539 AI955765 Hs.146907 ESTs 8.52

415102 M31899 Hs.77929 excision repair cross-complementing rode 8.51

405552 8.51

418068 AW971155 Hs.293902 ESTs, Weakly similar to prolyl 4-hydroxy 8.50

420133 AA426117 Hs.14373 ESTs 8.50

438887 R68857 Hs.265499 ESTs 8.50

446468 AI765890 Hs.16341 ESTs, Moderately similar to !!!! ALU SUB 8.50

446585 AV659397 Hs.282948 ESTs 8.50

441896 AW891873 gb«M3-m'009(MW05(X)-173-b02rir0090Homo 8.50

437718 AI927288 Hs.196779 ESTs 8.48

420656 AA279098 Hs.187636 ESTs 8.48

429303 AW137635 Hs.44238 ESTs 8.48

450624 AL043983 Hs.125063 Homo sapiens cDNA FLJ13825 fis, clone TH 8.48

452573 AI907957 Hs.287622 Homo sapiens cDNA FLJ14082 fis, clone HE 8.48

456341 AA229126 Hs.122647 N-myristoyltransferase 2 8.48

423024 AA593731 Hs.75613 CD36 antigen (collagen type I receptor, 8.47

446985 AL038704 Hs.156827 ESTs, Weakly similar to ALU1_HUMAN ALU S 8.46

431778 AL080276 Hs.268562 regulator of G-protein signalling 17 8.46

400268 8.46

421828 AW891965 Hs.289109 dimethylarginine dimethylaminohydrolase 8.45

417022 NM_014737 Hs.80905 Ras association (RalGDS/AF-6) domain fam 8.44

421029 AW057782 Hs.293053 ESTs 8.44

425171 AW732240 Hs500615 ESTs 8.44

459070 AI814302 gbw|71 d 2 Jtl NCI_CGAP_Lu19 Homo sapiens 8.42

406006 8.42

412643 AW971239 Hs.293982 ESTs 8.42

424775 AB014540 Hs.153026 SWAP-70 protein 8.42

446848 AW136083 Hs.195266 ESTs, Weakly similar to S59501 interfero 8.42

448043 AI458653 Hs.201881 ESTs 8.41

407183 AA358015 gb:EST66864 Fetal lung III Homo sapiens 8.40

412324 AW978439 Hs.69504 ESTs 8.40

419594 AA013051 Hs.91417 topoisomerase (DNA) II binding protein 8.40

430968 AW972830 gb:EST384925 MAGE resequences, MAGL Homo 8.40

431689 AA305688 Hs.267695 UDP-Gal:betaGlcNAc beta 1,3-galactosyltr 8.40

438582 AI521310 Hs.283365 ESTs, Weakly similar to ALU5_HUMAN ALU S 8.40

447685 AL122043 Hs.19221 hypothetical protein DKFZp566Q1424 8.40

459119 AW844498 Hs289G52 Homo sapiens LENG8 mRNA, variant C, part 8.38

400817 8.37

425235 BE245297 gb:TCBAP1E2482 Pediatric pre-B cell acut 8.37

409385 AA071267 gb:zm61g01.r1 Stratagene fibroblast (937 8.36

439121 BE047779 Hs.44701 ESTs 8.36

419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 8.36

408327 AW182309 Hs.249963 ESTs, Highly similar to dJ1170K4.4 [H.sa 8.35

403976 8.34

448064 AA379038 gb:EST91809 Synovial sarcoma Homo sapien 8.33

442914 AW188551 Hs.99519 Homo sapiens cDNA FLJ14007 fis, clone Y7 8.33

428032 AW997704 Hs.11493 Homo sapiens cDNA FLJ13536 fis, clone PL 8.32

434194 AF119847 Hs.263940 Homo sapiens PRO1550 mRNA, partial cds 8.32

458677 AW937670 Hs.254379 ESTs 8.32

420925 NM_015698 Hs.100391 T54 protein 8.30

416475 T70298 gb:yd26g02.s1 Soares fetal liver spleen 8.30

416852 AF283776 Hs.80285 Homo sapiens mRNA; cDNA DKFZp586C1723 (f 8.30

430676 AF084666 gb:Homo sapiens envelope protein RIC-3 ( 8.30

428455 AI732694 Hs.98520 ESTs 8.29

435343 AW194952 Hs.199028 ESTs 8.29

450783 BE266695 gb:601190242F1 NIH_MGC_7 Homo sapiens cD 8.29

404946 8.28

422942 AF054839 Hs.122540 tetraspan 2 8.28

453716 AA037675 Hs.152675 ESTs 8.28

437098 AA744488 Hs.132842 ESTs, Moderately similar to ALU1_HUMAN A 8.28

443907 AU076484 Hs.9963 TYRO protein tyrosine kinase binding pro 8.27

401930 AF106069 Hs.23168 ubiquitin specific protease 15 8.26

446554 AA151730 Hs.301789 ESTs, Weakly similar to similar to C.ele 8.26

426290 AB007918 Hs.169182 KIAA0449 protein 8.25

419904 AA974411 Hs.18872 ESTs 8.25

413886 AW958264 Hs.103832 ESTs, Weakly similar to TRHY_HUMAN TRICH 8.24

424738 AI963740 Hs.46826 ESTs 8.24

427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 8.24

424534 D87682 Hs.150275 KIAA0241 protein 8.24

424429 U63830 Hs.146847 TRAF family member-associated NFKB activ 8.24

442604 BE263710 Hs.279904 ESTs 8.22

442992 AI914699 Hs.13297 ESTs 8.22

427210 BE396283 Hs.173987 eukaryotic translation initiation factor 8.22

457229 BE222450 Hs.266390 ESTs 8.21

423730 AA330214 gb:EST33935 Embryo, 12 week II Homo sapi 8.21

411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 8.20

416051 AA835858 Hs.25253 Homo sapiens cDNA: FLJ20935 fis, clone A 8.20

417231 R40739 Hs.21326 ESTs 8.20

422049 W25760 Hs.77631 glycine cleavage system protein H (amino 8.20

427528 AU077143 Hs.179565 minichromosome maintenance deficient (S. 8.20

458776 AV654978 Hs.19904 cystathionase (cystathionine gamma-lyase 8.19

417687 AI828596 Hs.250691 ESTs 8.18

423218 NM_015896 Hs.167380 BLu protein 8.18

425397 J04088 Hs.156346 topoisomerase (DNA) II alpha (170kD) 8.18

406964 M21305 Hs.247946 Human alpha satellite and satellite 3 (u 8.18

402401 U42349 Hs.71118 Putative prostate cancer tumor suppresso 8.18

423397 NM_001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18

427857 AU33017 Hs.2210 thyroid hormone receptor interactor 3 8.17

401519 8.17

447188 H65423 Hs.17631 Homo sapiens cDNA FLJ20118 fis, clone CO 8.16

424704 AI263293 Hs.152096 cytochrome P450, subfamily IIJ (arachido 8.16

435854 AJ278120 Hs.4996 DKFZP564D166 protein 8.14

448556 AW885606 Hs.5064 ESTs 8.14

449217 AA278536 Hs.23262 ribonuclease, RNase A family, k6 8.14

453124 AI139058 Hs.23296 ESTs 8.14

442812 AI018406 Hs.131284 ESTs 8.14

421129 BE439899 Hs.89271 ESTs 8.14

TABLE 9A shows the accession numbers for those primekeys lacking a UnigeneID in Table
9. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
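
The three-column layout described above (Pkey, CAT number, Accession list) can be read programmatically. The following is a minimal sketch only, not part of the disclosed method; the names `ClusterRow` and `parse_cluster_rows` are hypothetical, and it assumes that any line not starting with a numeric Pkey continues the previous row's accession list.

```python
from typing import List, NamedTuple, Tuple


class ClusterRow(NamedTuple):
    pkey: int                    # Eos probeset identifier
    cat_number: str              # gene cluster number, kept as text
    accessions: Tuple[str, ...]  # Genbank accessions making up the cluster


def parse_cluster_rows(lines: List[str]) -> List[ClusterRow]:
    """Parse 'Pkey CAT-number Accession...' rows, joining wrapped lines."""
    rows: List[ClusterRow] = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank spacer lines
        if fields[0].isdigit() and len(fields) >= 3:
            rows.append(ClusterRow(int(fields[0]), fields[1], tuple(fields[2:])))
        elif rows:
            # Assumption: a line without a leading numeric Pkey is a wrapped
            # continuation of the previous row's accession list.
            prev = rows[-1]
            rows[-1] = prev._replace(accessions=prev.accessions + tuple(fields))
    return rows


example = [
    "408069 103655_1 H81795 Z42291 R20973 AA046920",
    "",
    "410846 1223902_1 AW807057 AW807054 AW807189",
    "AW807331",
]
rows = parse_cluster_rows(example)
```

Keeping the CAT number as text avoids losing the suffix after the cluster number, which is not a plain integer.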



Pkey CAT number Accession 



408057 103572CL-1 AW139555

408069 103655_1 H81795 Z42291 R20973 AA046920

408182 104479_1 AA047854 AA057506 AA053841

408338 1052148_1 AW867079 AW867086 AW182772

408828 108463_1 BE540279 AW410659 AA057857 R77693 BE278674

409126 110159_1 AA063426 AW962323 AW408063 AA063503 AA772927 AW753492 BE175371 AA311147

409292 111586_1 AA071051 AA070584 AA069938 AA102136 AA074430

409314 111841_1 AA070266 AA084957 AA126998

409385 112523_1 AA071267 T65940 T64515 AA071334

409398 1126716_1 AW386461 AW876408 AW386672 AW386599 AW876258 AW386619 AW386289 AW876136 AW876203 AW876213 AW876301
AW876295 AW876349 AW876365 AW876160 AW876369 AW876352 AW876271

409671 114731_1 AA076769 AA076781 AI087968

409768 1154035_1 AW499566 AW502378 AW499522 AW502046 AW502671 AW501917 AW501868 AW501721 AW502813

409841 1156088_1 AW502139 AW502432 AW502235 AW501683 AW502647

409842 1166119_1 AW501756 AW502096 AW502465 AW501715

409853 1156228_1 AW502327 AW502488 AW501829 AW502625 AW502687

410531 1207200_1 AW752853 H88044 BE156092

410688 1216101_1 AW796342 AW796356 BE161430

410846 1223902_1 AW807057 AW807054 AW807189 AW807193 AW807369 AW807429 AW807364 AW807365 AW807078 AW807256 AW807180
AW807331

410896 1226053_1 AW809637 AW809697 AW810554 AW809707 AW809885 AW810000 AW810088 AW809742 AW809816 AW809749 AW809639
AW809722 AW809836 AW809774 AW810023 AW810013 AW809813 AW609660 AW809728 AW809768 AW809951 AW809657
AW809954

411079 123128_1 AA091228 H71860 H71073

411424 1245497_1 AW845985 AW845991 AW845962

411499 1248105_1 AW849292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427

411507 1248807_1 AW850140 AW850195 AW850192

411534 1248827_1 AW850473 AW850471 AW850431 AW850523

411972 1268491_1 BE074959 AW880160

412110 1277844_1 AW893569 AW893571 AW893588 AW893593

412226 1284289_1 W26786 AW998812 AW902272

412257 1285376_1 AW903830 BE071916 tMMtl4M

412405 1293012_1 AW948126 AW948139 AW948196 AW948145 AW948162 AW948134 AW948127 AW948124 AW948153 AW948157 AW948125
AW948131 AW948158 AW948164 AW948151

413260 1356003_1 BE075281 BE075219 BE075123 BE075119 BE075046

413471 1371778_1 BE142098 BE142092

413729 1385114_1 BE159999 BE160056 BE160107 BE160139

414182 142409_1 AA136301 AI381776 AA136321

414989 1511339_1 T81668 C19040 C17569

415354 1534763_1 F06495 R24336 R13046

416011 1566439_1 H14487 R50911 Z43216

416475 1596398_1 T70298 H58072 R02750

417380 1672461_1 T06809 N75735

419392 1843934_-1 W28573

419541 185724_1 AW749617 R64714 AA244138 AA244137 BE094019

419544 185760_2 AI909154 AA526337 AA244193 AI909153

420819 196721_1 AA280700 AW975494 AA687385

421245 200620_1 AA285363 AA285333 AA285359 AA285328 AA285350

422673 219874_1 N59027 AA314694 N53937 R08100

422695
422650 
422940 
423730 
423790 
424365 
424606 
425265 
426959 
430876 



430966 
431180 
432093 
434596 
436357 
437159 
437495 
439097 
439120 
440134 



445629 
447229 
446064 
450783 
451045 
452549 
452560 



219996J 
222209J 
223106J 
231462J 
232031J 
238731J 
241409J 
249175.1 
273830.-1 
32168 1 



326269J 

328906 1 

341283.1 

38937J 

41842.1 

43393.1 

43765.1 

46858.1 

46879.1 

48675.1 

52842L.1 

645767.1 

71288J 

74761.1 

84655.1 

85673.1 

921802L1 

922216.1 



452712 928309.1 

453758 980026.1 

454093 1007366.1 

454563 1224342.1 

454791 1234759J 

454977 1247099.1 

455131 1254674.1 

455183 1259023J 

455254 1266449J 

455369 1285173.16 

455982 1396849.1 

456011 1410860.1 

456023 1416335.1 

457586 360505.1 

457595 364225.-1 

457751 399422.1 

459070 883688.1 

459081 889426.1 

459145 918957.1 

459172 921149.1 

459234 945240.-1 



AA315158 AW981298 N76067 AW802759 AI858495 W04474 

R35398BE252178AA318153 

BE077458 AA337277 AA319285 

AA330214AW962519 T54709 

BE152393 AA330984 BE073904 

AA339668 AW952809 AA3491 19 

AA343936 AA344060 AW963081 

BE245297 AA353976 AW505023 

BE262745 

AF084866 AT084870 AF084864 AF084667 AF084869 AF084865 AF084868 AW81 6206 AW812038 BE144613 BE144812 

AW812041 AW812040 AW812067 BE061583 BE061604 T05808 AI352469 AA580921 BE141783 BE 141 782 BE061601 

AW814393AW885029 

AW972830 AA527647 AA489820 AA570362 

K55883 AW971249 AA493900 H55788 

H28383 AW972670 H28359 AA525808 

T59538 T59589 T59598 T59542 AF147374 

AJ132085Z63805 

AL050072AW900148 

BE177778 BE177779 AL390180 AA359908 

H66948AF085954 H66949 

H56389AF085977 H56173 

BE410734 BE560117 BE270054 BE296330 BE267957 AI003007 BE545259 

AW891873 AWB91897 BE564764 

AI245701 BE272724 

BE617135AW504051 AW504283 

AA379036 AA150589 AI696854 BE621316 

BE266695 BE265474 N53200 BE267333 

AA215672 AI69S628 AA013335 H86334 AA017006 

AI907039 AI907081 

BE077084 AW 139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW806211 AW806212 AW806207 AW806208 
AW806210AI807497 

AW838616 AW838660 BE144343 AJ 9 14520 AW888910 BE184854 BE184784 
U83527AL120938 U83522 

AW860158 AW862385 AW860159 AW862386 AWB62341 AW821869 AW821893 AW062660 AW062656 
AW807530 AW807540 AW807537 AW846086 BE141634 AW846089 AW807499 AW807533 AWB38499 
BE071874 BE071882 AW820782 AW821007 

AW848032 AW848630 AW848478 AW848623 AW848484 AW848169 AW848830 AW848149 AW8481 19 AW848893 AW848903 
AW848407 

AW857913 AW657916 AW857914 AW861627 AW861626 AW861624 
AW984111 AW863918AW863856 

AW877015 AW877133 AW876978 AW877071 AW876988 AW877069 AW877063 AW877013 

AW903533 AW903516 AW903562 BE085202 BE085215 BE085214 BE085209 BE085172 BE085175 BE085193 BE085211 
BE085199 

BE176862 BE176876 BE176947 BE176878 

BE243628 BE246081 BE247016 BE241984 BE241534 BE246091 BE245679 BE243620 BE245998 BE242329 BE241417 

BE241457 BE242522 BE241989 BE241464 

R00028BE247630 

AW062439 AW751554 AA579463 

AA584854 

AI908236 AA663731 

AI814302A1B14426 

W07808A1822066 

AI903354 AI903489 A1903488 

BE063380 BE063346 A1906097 

AI940425

TABLE 9B shows the genomic positioning for those primekeys lacking unigene ID's and
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
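
The position column holds one or more start-end ranges separated by commas, one range per predicted exon. As an illustrative sketch only (the function names `parse_exons` and `total_exon_length` are hypothetical, and treating the coordinates as inclusive is an assumption not stated in the table legend), such a cell can be converted to intervals as follows:

```python
from typing import List, Tuple


def parse_exons(nt_position: str) -> List[Tuple[int, int]]:
    """Split a position cell such as '100-200,300-350' into (start, end) pairs."""
    exons = []
    for span in nt_position.split(","):
        start, end = span.split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming both endpoints are included (an assumption)."""
    return sum(end - start + 1 for start, end in exons)


# Position cell listed for Pkey 400557 (Plus strand).
exons = parse_exons("208453-208528,209633-209813")
length = total_exon_length(exons)
```

Under the inclusive-coordinate assumption, the two exons in this example span 76 and 181 nucleotides.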



Pkey Ref Strand Nt_position

400452 8113550 Minus 90308-90505
400557 9801261 Plus 208453-208528,209633-209813
400615 9908994 Plus 118036-118166,118681-118807
400802 8567867 Minus 174571-174856
400817 8569954 Plus 170793-170948
400880 9931121 Plus 29235-29336,36363-36580
400885 9958187 Minus 58242-58733
400926 7651921 Minus 52033-52158,53956-54120,54957-55052,55420-55480,56452-56666,57221-57718
400952 7658481 Plus 192667-192826,194387-194876
400991 8096825 Plus 159197-159320
401044 8117619 Plus 73501-73874
401124 8570296 Minus 124181-124391
401163 6981820 Plus 5302-5545
401201 9743387 Minus 138534-138629,139234-139294,140121-140335,142033-142479
401286 9801342 Minus 147036-147318
401384 6850939 Minus 58360-58545
401468 6433826 Plus 13056-13482
401515 7630851 Plus 29929-30126
401519 6649315 Plus 157315-157950
401672 9838136 Plus 128526-128704,130755-130860
401744 2576349 Plus 14595-14751
401851 7770425 Minus 146443-146664,147794-147971,148351-148480,148980-149111,149801-149949
401866 8018106 Plus 73126-73623
402240 7690131 Plus 104382-104527,106136-106372
402359 9211204 Minus 40403-41961
402585 9908890 Minus 174893-175050,183210-183435
402788 9796102 Plus 98273-101430
402802 3287156 Minus 53242-53432
402812 6010110 Plus 25026-25091,25844-25920
402828 8918414 Plus 69071-69642
402835 9187337 Plus 26961-27101
402838 9369121 Minus 32589-32735,35478-35666
402842 9369121 Minus 76355-76479
402895 9967547 Plus 85537-85671,86379-86469
402984 9581599 Minus 46624-46784
403137 9211494 Minus 92349-92572,92958-93084,93579-93712,93949-94072,94591-94748,95214-95337
403237 7637807 Plus 7271-7527
403259 7770585 Plus 4693-4857
403683 7331517 Plus 217175-217446
403690 7387384 Minus 78627-79583
403708 5705981 Minus 134394-134812
403838 4176355 Plus 19197-19502
403851 7708872 Plus 22733-23007
403976 7657840 Plus 24755-24969
404407 7329316 Minus 48154-48499
404426 7407959 Plus 77842-77954
404632 9796668 Plus 45096-45229
404741 8574139 Plus 143025-143467
404756 7706327 Plus 82849-83627
404946 7382189 Plus 134445-134750
405074 7770440 Plus 44340-44559,44790-45059
405125 8247873 Plus 137113-137814
405172 9966752 Plus 153027-153262
405236 7249076 Minus 151699-151915
405325 6094661 Minus 25818-26380
405411 3451356 Minus 17503-17778,18021-18290
405495 8050952 Minus 72182-72373
405552 1552506 Plus 46199-45647
405601 5815493 Minus 147835-147935,149220-149299
405685 4508129 Minus 37956-38097
405777 7263187 Minus 104773-105051
405656 7653009 Plus 101777-102043
405876 6758747 Plus 39694-40031
405932 7767812 Minus 123525-123713
405934 6758795 Plus 159913-160605
406006 8247801 Minus 42640-42776
406134 9163473 Plus 153291-153452
406189 7289992 Minus 22007-22234
406422 9256411 Plus 163003-163311
406516 7711422 Minus 128375-128449,128560-128784
406538 7711478 Plus 35196-35367,38229-38476,40080-40216,43522-43840
406554 7711566 Plus 106956-107121
406577 7711730 Plus 11377-11509

TABLE 10 shows genes, including expression sequence tags, differentially expressed in
taxol resistant prostate tumor xenografts as compared to taxol sensitive prostate tumor
xenografts. The genes are indicated as either being upregulated or downregulated during the
induction of taxol resistance in sequential passages of the grafts.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Eos: internal Eos name

F00-F14: passage number
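
To illustrate how a row of passage values relates to the up/down label, the sketch below compares the mean signal of the early passage columns with that of the late ones. This is only one possible reading of the data: the function name `passage_trend` and the split into the first six and last six columns are assumptions made for illustration, not the criterion used to build the table.

```python
from statistics import mean
from typing import Sequence


def passage_trend(signals: Sequence[float], n_early: int = 6) -> str:
    """Label a passage series by comparing early and late mean signal.

    n_early is an assumed cut-off between 'early' and 'late' passages.
    """
    early = mean(signals[:n_early])
    late = mean(signals[n_early:])
    if late > early:
        return "up"
    if late < early:
        return "down"
    return "unchanged"


# Twelve passage values listed for Pkey 112971 (columns F00 through F14).
row_112971 = [290, 281, 267, 335, 270, 284, 150, 157, 83, 89, 49, 75]
label = passage_trend(row_112971)
```

For this row the late-passage mean is well below the early-passage mean, which agrees with the "down" label listed for that probeset.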



Pkey ExAccn UnigeneID UnigeneTitle Eos Resp. F00 F00 F02 F02 F05 F05 F07 F09 F10 F11 F13 F14

117921 N51002 Hs.47170 UprinA2 PM28 UP 1 9 8 9 32 20 34 122 1<S 82 71 111
112971 T17185 Hs.4299 ESTs CHA1 down 290 281 267 335 270 284 150 157 83 89 49 75
126645 AI167942 Hs.61635 STEAP PAA5 down 106 111 103 71 34 67 33 14 2 1 1 1
119018 N95796 Hs.179809 ESTs PAB2 down 765 841 757 909 742 704 478 428 253 175 228 238
110844 N31952 Hs.167531 ESTs PAV7 down 175 192 147 141 123 129 73 65 55 48 54 84
100654 HG2841-HT2969 Hs.75442 Albumin, A PM01 down 666 605 504 728 357 445 602 187 117 127 117 113
100655 HG2841-HT2970 Hs.75442 Albumin, A PM02 down 620 653 486 688 368 386 606 175 101 95 115 97
102076 U09579 Hs.252437 cyclin-dep PM03 down 101 94 143 190 105 107 88 40 34 31 46 22
102208 U22961 Hs.75442 albumin PM04 down 495 424 323 518 252 296 467 188 169 143 165 145
103739 AA075779 mitochondr PM05 down 75 190 606 230 378 106 218 88 69 192 69 99
107036 AA599690 Hs.15725 SBBI48 PM06 down 67 124 115 168 132 111 66 71 49 70 38 50
108242 AA062746 ESTs PM07 down 14 20 252 13 22 43 193 10 10 104 21 18
108282 AA065143 solute car PM08 down 27 54 178 73 108 37 53 24 14 53 15 34
108679 AA115963 beta-1-gb PM09 down 680 893 1292 656 869 369 1 74 118 662 359 409
108731 AA126313 Hs.107476 ATP syntha PM10 down 10 19 185 25 60 1 32 3 7 14 1 1
110875 H89355 Hs.6598 adrenergic PM11 down 207 334 237 239 231 220 119 145 93 64 56 124
115412 AA283804 Hs.193552 ESTs PM12 down 146 316 282 271 340 334 115 238 100 196 63 207
115844 AA430124 Hs.234607 MDM2 PM13 down 49 93 94 154 132 91 23 54 23 76 14 41
120588 AA281591 Hs.16193 ESTs PM14 down 80 157 58 141 159 127 39 83 35 37 16 46
132349 Y00705 Hs.181288 serine pro PM15 down 146 217 214 150 106 128 177 85 54 63 66 41 56
132888 AA490775 Hs.5920 N-acetylma PM16 down 92 150 132 178 126 139 53 94 48 67 80
132967 AA032221 Hs.61635 STEAP PM17 down 224 208 203 215 205 180 132 65 68 50 48 63
133063 AA283085 Hs.24065 ESTs PM18 down 85 148 161 150 92 108 42 99 42 65 29 126
134374 D62633 Hs.8236 ESTs PM19 down 230 240 194 212 231 189 89 123 107 95 68 91
135400 M23263 Hs.99915 androgen r PM20 down 36 167 99 178 132 101 23 71 26 122 14 44

TABLE 11 shows genes, including expression sequence tags, that are up-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background subtracted normal prostate : prostate tumor tissue
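
Because R1 is a normal-to-tumor ratio, small values correspond to higher signal in tumor tissue. As a hedged illustration (the helper name `tumor_fold_change` is hypothetical and not part of the disclosure), the reciprocal of R1 gives the tumor-over-normal fold change:

```python
def tumor_fold_change(r1: float) -> float:
    """Convert a normal:tumor ratio (R1) into a tumor:normal fold change."""
    if r1 <= 0:
        raise ValueError("R1 must be a positive ratio")
    return 1.0 / r1


# An R1 of 0.02 means the tumor signal is 50 times the normal signal.
fold = tumor_fold_change(0.02)
```

Values of R1 below 1 therefore indicate up-regulation in tumor tissue, which is consistent with the table caption.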



Pkey ExAccn UnigeneID Unigene Title R1

101336 L49169 Hs.75678 FBJ murine osteosarcoma viral oncogene homolog B 0.012
130642 M63438 Hs.156110 Immunoglobulin kappa variable 1D-8 0.015
133512 X01677 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.017
133436 H44631 Hs.737 immediate early protein 0.017
129292 X13810 Hs.1101 POU domain; class 2; transcription factor 2 0.019
100610 HG2566-HT4792 Microtubule-Associated Protein Tau, Alt. Splice 3, Exon 8 0.02
133448 M34516 Hs.170116 immunoglobulin lambda-like polypeptide 3 0.021
125193 W67577 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.022
133456 T49257 Hs.183704 ubiquitin C 0.022
134546 AA459310 Hs.8518 Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone DKFZp586L1722) 0.023
102131 U15085 Hs.1162 major histocompatibility complex; class II; DM beta 0.023
101375 M13560 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.023
100674 HG3033-HT3194 Spliceosomal Protein Sap 62 0.024
134365 R32377 Hs.62240 syntaxin 3A 0.027
132335 D80387 Hs.189885 ESTs 0.027
110303 H37901 Hs.32703 ESTs 0.028
131678 N59162 Hs.30542 ESTs 0.028
116599 D80046 Hs.250879 ESTs 0.029
133769 M17733 Hs.75966 thymosin; beta 4; X chromosome 0.029
107904 AA026648 Hs.61389 ESTs 0.03
129427 T80746 Hs.111334 ferritin; light polypeptide 0.03
105987 AA406631 Hs.110299 mitogen-activated protein kinase kinase 7 0.03
131466 F03233 Hs.27169 ESTs 0.032
102859 X00274 Hs.76807 Human HLA-DR alpha-chain mRNA 0.032
134626 S82198 Hs.8709 caldecrin (serum calcium decreasing factor; elastase IV) 0.032
134170 M63138 Hs.79572 cathepsin D (lysosomal aspartyl protease) 0.033
131713 X57809 Hs.181125 immunoglobulin lambda gene cluster 0.034
100748 HG3517-HT3711 Alpha-1-Antitrypsin, 5' End 0.034
118769 N74496 ESTs 0.034
111734 R25375 Hs.126916 ESTs 0.036
109221 AA192755 Hs.85840 ESTs; Weakly similar to stac [H.sapiens] 0.036
133846 AA480073 Hs.76719 U6 snRNA-associated Sm-like protein 0.036
135281 AA401575 Hs.97757 ESTs 0.037
119073 R32894 Hs.45514 v-ets avian erythroblastosis virus E26 oncogene related 0.037
100760 HG3576-HT3779 Major Histocompatibility Complex, Class II Beta W52 0.037
101426 M19483 Hs.25 ATP synthase; H+ transprtng; mitochndrl F1 complex; beta polypept 0.038
129568 AA428025 Hs.114360 transforming growth factor beta-stimulated protein TSC-22 0.038
130900 Z38468 Hs.21036 ESTs; Moderately similar to F25965_3 [H.sapiens] 0.039
133879 M13829 Hs.77183 v-raf murine sarcoma 3611 viral oncogene homolog 1 0.039
100627 HG2702-HT2798 Serine/Threonine Kinase (Gb:Z25424) 0.039
129424 M55593 Hs.111301 matrix metalloproteinase 2 (gelatinase A; 72kD gelatinase; 72kD type IV collagenase) 0.039
128652 AA621245 Hs.103147 ESTs; Weakly similar to similar to SP:YR40_BACSU [C.elegans] 0.039
129979 T72635 Hs.13956 ESTs 0.039
133468 X03068 Hs.73931 major histocompatibility complex; class II; DQ beta 1 0.04
102636 U67092 Human ataxia-telangiectasia locus protein (ATM) gene, exons 1a, 1b, 2, 3 and 4, partial cds 0.04
129536 M33493 Hs.184504 tryptase; alpha 0.04
133599 M64788 Hs.75151 RAP1; GTPase activating protein 1 0.041

102104 
131340 
130446 
101352 
122593 
130181 
134071 

108129 

130511 

133336 
132982 
131880 
130540 
133467 
101191 
101860 
102799 

107200 
101166 
134289 
135329 
124950 

102919 
100574 
131286 
102675 

131332 
101634 
113118 
124884 
130523 
110244 
131932 
132509 
133372 
100817 
106748 
135401 
130479 
102589 
121521 
135340 
132336 
115368 
101278 
103284 
100564 
133132 
121811 
129613 
132468 
120111 



130386 
104275 
106305 
116431 
120339 
114427 
118821 
118979 
107495 
120240 



U12139 

AM78305 

X79510 

L77701 

AA453310 

R39552 

Z14093 

AA053252 

L32137 

AA291456 

L02326 

AA047034 

U35234 

AA258595 

L20688 

M95610 



Hs.25817 



D20350 

L14927 

M54915 

AA436026 

T03786 

X12447 

HG2279-HT2375 

AA450092 

U72512 

R50487 

M57731 

T47906 

R77276 

W76097 

H26742 

AA454980 

H09751 

AA291139 

HG4011-HT4804 

AM76438 

L14813 

R44163 

U62015 

AA412165 

AA425137 

AA342422 

AA282133 

L38487 

X80200 

HG2239-HT2324 

Z40883 

AA424635 

AA279481 

S79854 

W95841 

Z83741 

F10874 

C02170 

AA436146 

AA609878 

AA206465 

AA017063 

N79070 

N93798 

W78776 

Z41732 



Hs.16297 
Hs.128749 
Hs.151608 
Hs.78950 

Hs.185848 

Hs.1584 

Hs.71190 

Hs.198118 

Hs.33818 

Hs.159534 

Hs.73931 

HS.83656 

Hs.37165 



Hs.5628 

Hs.2099 

Hs.81170 

Hs.98858 

Hs.151531 

Hs.183760 

Hs^5300 



Hs£5717 

Hs.75765 

HS220512 

Hs.120911 

Hs.214507 

Hs.25367 

Hs.25601 

Hs.5038 

M8.72242 

Hs.7891 

Hs.169271 

Hs.12457 

Hs.8867 

Hs.97358 

Hs.99093 

Hs.45073 



Human atpha1{XI) collagen (COL11 A1) gene, 5 1 region and exon 1 

Homo sapiens chromosome 19; cosmld R27216 

protein tyrosine phosphatase; non-receptor type 21 

COX17 (yeast) homotog; cytochrome c oxidase assembly protein 

alpha-methytacyl-CoA racemase 

Homo sapiens clone 23622 mRNA sequence 

branched chain keto acid dehydrogenase E1 ; alpha polypeptide 



Hs. 110849 
Hs.8375 

Hs.65588 

Hs.98416 

Hs.238831 

Ks.49322 

Hs.136031 

Hs.248174 

Hs^34249 

Hs.39387 

Hs.12828 

Hs.55289 

Hs.256470 

Hs.94789 
Hs.43666 
Hs.90375 
Hs.66049 



ESTs; Weakly similar to H ALU SUBFAMILY J WARNING 
ENTRY II [Rsapiens] 

cartilage oligomeric matrix protein (pseudoachondroplasia; 

epiphyseal dysplasia 1; multiple) 

ESTs 

immunoglobulin lambda-like polypeptide 2 
RecQ protein-like 5 

protein tyrosine phosphatase; receptor type; S 
major histocompatibility complex; class II; DQ beta 1 
Rho GDP dissociation inhibitor (GDI) beta 
collagen; type IX; alpha 2 

Human endogenous retroviral H proteasertntegrase-derhrcd ORF1 
mRNA, complete cds, and putative envelope prot mRNA, partial cds 
ESTs 

Hpocatin 1 (protein migrating faster than albumin; tear prealbumin) 

pim-1 oncogene 

ESTs 

protein phosphatase 3 (formerly 2B); catalytic subuntt; beta tsoform 
(caJcrneurin Abeta) 



Triosephosphate Isomerase 

Homo sapiens clones 24718 and 24825 mRNA sequence 
Human B-cell receptor associated protein (hBAP) alternatively 
spliced mRNA, partial 3'UTR 
ESTs 

GR02 oncogene 
ESTs 
ESTs 
ESTs 

ESTs; Weakly similar to ALR [H.saplens] 
chromodomaln helicase ONA binding protein 3 



ESTs 

Dystrophin-Associated Glycoprotein, 50 Kda, AIL Splice 2 
ESTs 



Homo sapiens clone 23770 mRNA sequence 
cysteine-rich; angiogenic inducer; 61 
EST 

Homo sapiens chromosome 19; cosmid R28379 
ESTs 

ESTs; Weakly simitar to similar to collagen [C.elegans] 

estrogen-related receptor alpha 

TNF receptor-associated factor 4 

Potassium Channel Proteirv(Gb211585) 

ESTs; Weakly similar to dJ393P125 [Rsaptens] 

ESTs 

ESTs; Weakly similar to collagen alpha 1 (XVIII) chain [M.musculus] 

deiodinase; iodothyrontne;type III 

ESTs 

H2A histone family; member M 

mitogen-activated protein kinase 8 Interacting protein 1 

ESTs; Weakly smlr to weak smlrity to ribosomal prot L14 [C.elegans] 

ESTs 

ESTs; Weakly smlr to 1 10 KD CELL MEMBRANE GLYCOPROTEIN 
EST 

ESTs; Highly similar to Miz-1 protein [Haptens] 
ESTs 

protein tyrosine phosphatase type IV A; member 3 

ESTs 

ESTs 



0.041 
0.041 
0.042 
0.042 
0.042 
0.042 

0.042 

0.043 

0.043 
0.043 
0.044 
0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.044 
0.044 
0.044 

0.044 
0.044 
0.045 
0.045 

0.045 
0.045 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.046 
0.047 
0.047 
0.047
0.047 
0.047 
0.048 

0.048

0.048 

0.048

0.048 

0.048 

0.048 

0.048 

0.048 

0.049 

0.049 

0.049 

0.049 

0.049 

0.049 

0.05 

0.05

0.05 

0.05 

0.05 

0.05 

0.051 

0.051 
129942
119210 
101046 
114086 
110171 
101004 
129715 
101581 
113285 
127537 
100813 
101841 
135053 
101419 
119724 
102673 
129877 
114788 
123812 
117669 
123782 
102395 
133795 
123193 
132595 
104161 
115330 
112893 
133475 
128699 
102940 
131299 
102495 
129594 



126702 
124386 

130538 
114299 
115604 
106052 
131730 
131285 
129705 
123175 
103592 
118196 

104886 
104250 

113301 
110441 
125297 
135258 
130633 
112006 



Z41309 

R40037 

W81679 

AA482390 

R54798 

M17254 

AA426304 

K02405

L41143 

T55087 

U95301 

R93340 

K01160 

Z38266 

H19964 

J04101 

N58479 

M34996 

T66830 

AA569531 

HG3995-HT4265 

M93107 

R77159 

M17886

W69468 

U72509 

AA248589 

AA156737 

AA620607 

N39237 

AA610111 

U41767 

M12529 

AA489228 

AA253369 

AA456471 

AA281145 

T08000 

L29217 

K03207 

X13956 

AA431464 

U51240 

R70379 



U54602 
N27368 

M20766 

Z40782 

AA400378 

AA416947

U05681 

AA479498 

X78706 

AA489010 

Z30644 

N59478 

AA053348 
AF000575

T67452 

H50302

Z39215 

AA292423 

T92363 

R42607 



Hs.12400 

Hs£1506 

Hs.5174 

Hs^6510 

Hs.26239 

Hs.45514 

Hs.24174 

Hs.73933 

Hs.232069 



Hs.144442 
Hs.92995 

Hs.12770 

Hs.31709 

Hs£48109 

Hs.12126 

Hs.198253 

Hs.182712 

Hs.162859 

Hs.76893 
Hs.93678 
Hs.177592 
Hs.47622 

Hs.13094

Hs.103904

Hs.111591

Hs.44977

Hs.162695

Hs.92208 

Hs.169401 

Hs.136956 

Hs.155742 

Hs.7724 

Hs.88827 

Hs.194664 

Hs.73987 

Hs.103972 

Hs.24998

Hs.25426 

Hs.79356 

Hs.115396

Hs.207689 

Hs.2785

Hs.212414

Hs.159509

Hs.22920

Hs.49391 

Hs.6382 

Hs.31210 

Hs.25274 

Hs.12068 

Hs.178400

Hs.123059 

Hs.48396 

Hs.144626 
Hs.105928 

Hs.13104 

Hs.19845 

Hs.159409 

Hs.97272 

Hs.178703 

Hs.22241



ESTs 0.051 

ESTs 0.052 

ribosomal protein S17 0.052 

ESTs; Modly smlr to vacuolar prot sorting homolog r-vps33b [R.norvegicus] 0.052

ESTs 0.052 

v-ets avian erythroblastosis virus E26 oncogene related 0.052 

ESTs 0.052 

Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds 0.052

T-cell leukemia translocation altered gene 0.053 
yb45c08.r1 Stratagene fetal spleen (#937205) Homo sapiens cDNA clone IMAGE:74126 5', mRNA sequence. 0.053

phospholipase A2; group X 0.053 

ESTs 0.053 

Accession not listed in Genbank 0.053 

Homo sapiens PAC clone DJ0777O23 from 7p14-p15 0.053 

ESTs 0.053 

v-ets avian erythroblastosis virus E26 oncogene homolog 1 0.053 

ESTs; Weakly similar to LR8 [H.sapiens] 0.053

major histocompatibility complex; class II; DQ alpha 1 0.053 

ESTs 0.053 

ESTs 0.054 

Cpg-Enriched Dna, Clone S19 0.054

3-hydroxybutyrate dehydrogenase (heart; mitochondrial) 0.054 

ESTs 0.054 

ribosomal protein; large; P1 0.054

ESTs 0.055 

Human alternatively spliced 88 (B7) mRNA, partial sequence 0.055 

ESTs; Weakly similar to ORF YGR101w [S.cerevisiae] 0.055

EST 0.055 

ESTs 0.055 

ESTs 0.055 

EST 0.055 

a disintegrin and metalloproteinase domain 15 (metargidin) 0.055

apolipoprotein E 0.055

ESTs 0.056 

glyoxylate reductase/hydroxypyruvate reductase 0.056 

KIAA0963 protein 0.056 

ESTs 0.056 

bassoon (presynaptic cytomatrix protein) 0.056 

CDC-like kinase 3 0.056

proline-rich protein BstNI subfamily 4 0.056

Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus 0.056

ESTs; Weakly similar to unknown [H.sapiens] 0.057

lysosomal-associated multispanning membrane protein-5 0.057

Human germline IgD chain gene; C-region; C-delta-1 domain 0.057

EST 0.057 

keratin 17 0.057
sema domain; immunoglobulin domain (Ig); short basic domain; secreted; (semaphorin) 3E 0.057

alpha-2-plasmin inhibitor 0.057

similar to S68401 (cattle) glucose Induced gene 0.057 

ESTs 0.057 

ESTs; Highly similar to KIAA0612 protein [H.sapiens] 0.057

B-cell CLL/lymphoma 3 0.057

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.058 

carnitine acetyltransferase 0.058 

ESTs 0.058 

chloride channel Kb 0.058 
ESTs; Moderately similar to tumor necrosis factor-alpha-induced protein B12 [H.sapiens] 0.058

growth differentiation factor 11 0.058
leukocyte immunoglobulin-like receptor; subfamily B (with TM and ITIM domains); member 3 0.058

EST 0.058 

ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens] 0.058

ESTs 0.058 

ESTs; Weakly similar to dJ281H8.2 [H.sapiens] 0.058 

ESTs 0.058 
hypothetical protein 0.058 
130805
134907 
132619 
135115 
100531 
124530 
119960 
132793 
101076 
130655 
134458 
105904 
132878 
121828 
133418 
129317 
130153 
124403 
127683 
129814 
131770 
117557 
103522 
120029 
102135 
123617 
112136 
133725 
102069 
106555 
123289 
109088 
129399 
129375 
135271 
132958 
129384 
123427 
105236 
101012 
134791 
133700 
123887 
129363 
105719 
124226 
117437 

132741 
134437 
107664 
120844 
101574 
131219 
103495 
129607 
106467 
128841 
100515 
119332 
134516 
135012 
103575 
115514 

103996 
110505 
133912 
129581 



U12194 
D80002 
AA404565
N35489 

HG1872-HT1907

N62256 

W87533 

AA478999 

L04270 

N92934 

AA192614 

AA401452 

AA026793 

AA425166 

U76366 

N46244 

D85815 

N31745 

AA668123 

W20070 

D59682 

N33920 

Y10514 

W91960 

U15460 

AA609183 

R46100 

V00563 

U09196 

AA455000 

AA491226 

AA166837 

AA263028 

W79850 

AA397763 

W90398 

AA477106 

AA598548 

AA219179 

J04444 

L18983 

K01396 

AA621065 

H05704 

AA291644 

H62396 

N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C00476 

Y09022 

AA404594 

AA450040 

T16358 

HG1723-HT1729 

T54095 

AA171939 

X73608 

Z26256 

AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238 
Hs.178292 
Hs.53447 
Hs.94653 

Hs.102727 
Hs.32699 



Hs.1116 

Hs.17409 

Hs.83577 

Hs.32060 

Hs.58679 

Hs.98497 

Hs.172727 

Hs.110373

Hs.15114 

Hs.102493 

Hs.134170 

Hs.168625 

Hs.31833 

Hs.44532 

Hs.250640

Hs.41691 

Hs.181131 

Hs.9739 

Hs.179543

Hs.82520 

Hs.16725 

Hs.105280 

Hs.72620 

Hs.111076 

Hs.11081

Hs.97562 

Hs.6147 

Hs.110757 

Hs.112471

Hs.19105 

Hs.697 

Hs.89655 

Hs.75621 

Hs.112943 

Hs.110746

Hs.36793 

Hs.190266 



Hs.55898 

Hs.198253 

Hs.5326 

Hs.96917 

Hs.158029 

Hs.24395 

Hs.153591 

Hs.11607 

Hs.154162 

Hs.106443 



Hs.23413 
Hs.93029 

Hs.55609 



Hs.20495 
Hs.77522 
Hs.180255 



sodium channel; voltage-gated; type I; beta polypeptide 0.058 

KIAA0180 protein 0.058 

ESTs; Moderately similar to kinesin light chain 1 [M.musculus] 0.058

neurochondrin 0.058 

Major Histocompatibility Complex, Dg 0.058 

EST 0.058 

ESTs; Moderately similar to LIV-1 protein [H.sapiens] 0.058

KIAA0906 protein 0.058

lymphotoxin beta receptor (TNFR superfamily; member 3) 0.058

cysteine-rich protein 1 (intestinal) 0.058 

cysteine and glycine-rich protein 3 (cardiac LIM protein) 0.058 

ESTs 0.059 

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus] 0.059 

ESTs 0.059 

Treacher Collins-Franceschetti syndrome 1 0.059

ESTs 0.059 

ras homolog gene family; member D 0.059 

ESTs 0.059 

ESTs 0.059 

KIAA0979 protein 0.059 

ESTs 0.06 

diubiquitin 0.06

H.sapiens mRNA for CD152 protein 0.06

sequence-specific single-stranded-DNA-binding protein 0.06

activating transcription factor B 0.06 

ESTs 0.06 

ESTs 0.061 

Immunoglobulin mu 0.061 

Hu 1.1 kb mRNA upregltd in retinoic acid treated HL-60 neutrophilic cells 0.061

ESTs 0.061

ESTs; Weakly similar to dJ963K235 [H.sapiens] 0.061

DKFZP434I114 protein 0.061

malate dehydrogenase 2; NAD (mitochondrial) 0.061

ESTs; Weakly similar to HPBRII-7 protein [H.sapiens] 0.061

ESTs 0.061 

KIAA1075 protein 0.061 

DNA segment on chromosome 21 (unique) 2056 expressed sequence 0.061 

ESTs 0.061 

translocase of inner mitochondrial membrane 17 (yeast) homolog B 0.061

cytochrome c-1 0.062 

protein tyrosine phosphatase; receptor type; N 0.062 

protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 0.062

ESTs 0.062

H.sapiens HCR (a-helix coiled-coil rod homologue) mRNA; complete cds 0.062

ESTs 0.062 

ESTs 0.062 
yw5e3.s1 Weizmann Olfactory Epithelium H sapiens cDNA clone IMAGE:255676 3' smlr to contains L1.t3 L1 repetitive element;, mRNA seq 0.062

ESTs; Highly similar to OASIS protein [M.musculus] 0.062 

major histocompatibility complex; class II; DQ alpha 1 0.062 

ESTs; Moderately similar to pim-1 protein [H.sapiens] 0.062

ESTs 0.062 

protein kinase; cAMP-dependent; catalytic; gamma 0.062 

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK) 0.062

Not56 (D. melanogaster)-like protein 0.062

ESTs 0.062 

ADP-ribosylation factor-like 2 0.062

ESTs 0.062

Macrophage Scavenger Receptor, Alt. Splice 2 0.062
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 0.062

ESTs 0.062

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican) 0.063

H.sapiens isoform 1 gene for L-type calcium channel, exon 1 0.063
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE; CYTOPLASMIC [H.sapiens] 0.063

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence 0.063

DKFZP434F011 protein 0.063

major histocompatibility complex; class II; DM alpha 0.063

major histocompatibility complex; class II; DR beta 1 0.063
130139


R38280 


Hs.150922 


105817 


AA397825 


Hs.5307 


134658 


AA410617 


Hs.178009 


100306 


D50495 


Hs.80598 


100277 


D42053 


HSJ5890 


133116 


D81259 


Hs.6529 


134909 


AA521488 


Hs.90998 


130319 


X74794 


Hs.154443 


132057 


AA102469 


Hs.173484 


108334 


AA070473 




129763 


F10815 


Hs.12373 


135112 


T67464 


Hs.94617 


122269 


AA436656 


Hs.98910 


133082 


AA457129 


Hs.6455 


113213 


T58607 




106228 


AA429290 


Hs.17719 


130192 


Y12661 


Hs.171014 


104694 


AA054087 


Hs.18858


103508 


Y10141 




126474 


U40671 


Hs.100299 


134012 


AA417821


Hs.237924 


134536 


AA457735 


Hs.850 


111714 


R23146 


Hs.23466 


110521 


H57060 


Hs.103268 


103282 


X80198


Hs.77628 


113921 


W80730 


Hs.28355 


129331 


N93465 


Hs.110453


111316 


N74597 


Hs.180535 


135138 


AA036794 


Hs.95196 


107289 


T10792 


Hs.172098


121405 


AA406083 


Hs.98007 


124965 


T16275 


Hs.106359 


106595 


AA456933 


Hs.174481 


100106 


AF015910 




134715 


AA262757 


Hs.89040 


135367 


AA480109 


Hs.9963 


111533 


R08548 


Hs^51651 


128509 


R53109 


Hs.247362


101030 


J05037 


Hs.76751 


102753 


U80226




126991 


R31652 


Hs.821 


109583 


(=02322 


HSJ26135 


119241 


T12559 


Hs.221382 


130569 


AA156597 


Hs.256441 


112926 


T10316 


Lie /(OfVO 


120495 


AA256073 


Hs.190626 


130931 


AA278412 


Hs.21346 


129982 


M87789 


Hs.140 


133832 


H03387 


Hs.241305 


110697 


H93721 


Hs.20798 


121183 


AA400138 


Hs.97703 


130953 


U12707 


Hs.2157 


102218 


U24183 


Hs.75160 


114181 


Z39079 


Hs.8021 


116581 


D51287 


Hs.82148 


132498 


T87708 


Hs.50098 


103788 


AA096014 


Hs.9527 


102459 


U48936 




100373 


D79999


Hs.77225 


132717 


AA203321 


Hs.151696 


128863 


D87462 


Hs.106674 


115193 


AA262029 


Hs.88218 


124558 


N66046 


Hs.141605 


117225 


N20392 


Hs.42846 


110665 


H83380 


Hs.32757



BCS1 (yeast homolog)-like 0.064

synaptopodin 0.064

ESTs 0.064 

transcription elongation factor A (SII); 2 0.064
site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory element binding proteins) 0.064

ESTs 0.064 

KIAA0128 protein 0.064 

minichromosome maintenance deficient (S. cerevisiae) 4 0.064

ESTs 0.064 
zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone IMAGE:5399 3', mRNA sequence 0.064

KIAA0422 protein 0.064

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.064

ESTs 0.064

RuvB (E. coli homolog)-like 2 0.064
ya94a02.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone IMAGE:69290 3', mRNA sequence. 0.065

ESTs 0.065

VGF nerve growth factor inducible 0.065

phospholipase A2; group IVC (cytosolic; calcium-independent) 0.065

H.sapiens DAT1 gene, partial, VNTR 0.065

ligase III; DNA; ATP-dependent 0.065

ESTs; Highly similar to CGI-69 protein [H.sapiens] 0.065

IMP (inosine monophosphate) dehydrogenase 1 0.065

ESTs 0.065 

ESTs 0.065 

steroidogenic acute regulatory protein related 0.065 

ESTs 0.065 

ESTs; Highly similar to CGI-38 protein [H.sapiens] 0.065

ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens] 0.065

ESTs; Weakly similar to T20B12.3 [C.elegans] 0.065

ESTs 0.065

ESTs 0.065

ESTs 0.065

ESTs 0.066 

Homo sapiens unknown protein mRNA, partial cds 0.066 

prepronociceptin 0.066 

TYRO protein tyrosine kinase binding protein 0.066 

EST 0.066 

dimethylarginine dimethylaminohydrolase 2 0.066

serine dehydratase 0.066 

Human gamma-aminobutyric acid transaminase mRNA, partial cds 0.067 

biglycan 0.067

ESTs 0.067

ESTs 0.067

EST; Moderately similar to CGI-136 protein [H.sapiens] 0.067

ESTs 0.067 

ESTs 0.067 

ESTs; Weakly similar to F42C5.7 gene product [C.elegans] 0.067

Immunoglobulin gamma 3 (Gm marker) 0.067 

estrogen-responsive B box protein 0.067 

ESTs 0.067

ESTs 0.067

Wiskott-Aldrich syndrome (eczema-thrombocytopenia) 0.067

phosphofructokinase; muscle 0.067 

KIAA1058 protein 0.067 
ribosomal protein S12 0.067
ESTs 0.068
ESTs; Highly similar to HSPC013 [H.sapiens] 0.068
Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA, 5' end, partial cds 0.068
ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1 0.068
DKFZP727G051 protein 0.068
BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase) 0.068
ESTs 0.068
ESTs 0.069
ESTs 0.069
ESTs 0.069 
132905


U70663 


Hs.182865 


105778 


AA348910 


Hs.153299


134770 


R72079 


Hs.69575 


123097 


AA485869 


Hs.105671 


100750 


HG3523-HT4899 




125091 


T91518 




100756 


HG3565-HT3768




113483 


T87768 


Hs.16439 


101119 


L09708 


Hs.2253 


102286 


U31628 


Hs.12503 


135349 


D83174 


Hs.9930 






Ue DOAOC 


133675 


AA443720 


Hs.7551 


105422 


AA251014 


Hs.12210 


102932 


X13334 


Hs.75627 


119147 


R58878 


Hs.65739 


104900 


AA055048 


Hs.180481 


133185 


AA481404


Hs.6666 


115496 


AA290674 


Hs.71819 


121005 


AA398332 


Hs.97613 


124869 


R69088 


H8J28728 


129154 


N23673 


Hs.108969


112161 


R48295 




125251 


W87486 


Hs.141464 


134298 


J00116 


Hs.81343 


119745 


W70264 


Hs.58093 


131306 


AA232686 


Hs^5489 


107776 


AA018820 


Hs^21147 


134271 


AA199630 


Hs.184456 


101798 


M85220 




135402 


S76942 


Hs.99922


118742 


N74052 


Hs£0424 


131867 


N64658 


Hs.3353 


102923 


X12517 


Hs.1063 


100775 


HG371-HT26388 




111020 


N54361 


Hs.165726 


134224 


X80822 


Hs.163593 


124059 


F13673 


Hs.99769 


133972 


AA160743 


Hs.78019 


129681 


AA436009


Hs.178186 


103065 


X58399 


Hs.81221 


124966 


T19271 


Hs.155560


112270 


R53021 


Hs.203358 


116704 


F10183 


Hs.66140 


129890 


M13699 


Hs.111461


127345 


AA972008 


Hs.166253 


112436 


R63090


Hs.28391


114531


AA053033


Hs.203330


135122 


H99030 


Hs.94814 


103934 


AA281338 


Hs.134200 


109383 


AA215369 


Hs.185764 


112647 


R33329 


Hs.33403 


127083 


Z44079 


Hs.91608 


133027 


AA402624


Hs.63236 


122086 


AA432121 


Hs.250986 


110405 


H47542 


Hs.33982 


128697 


AB002344 


Hs.103915 


112221 


R50380 


Hs.25670 


100478 


HG1067-HT1087 




115598 


AA400129 


Hs.65735 


132491 


AA227137 


Hs.4934 


101655 


M60299 




106018 


AA411887


Hs.34737 


129683 


W05348 


Hs.158196 


134137 


F10045 


Hs.79347 


114008 


W89128 


Hs.19872 



Kruppel-like factor 4 (gut)

DOM-3 (C. elegans) homolog Z

CD79B antigen (immunoglobulin-associated beta)

ESTs

Proto-Oncogene C-Myc, Alt. Splice 3, Orf 114

ye20f05.s1 Stratagene lung (#937210) H sapiens cDNA clone IMAGE:

3' similar to contains Alu repetitive element; contains MER12 repetitive element;

mRNA sequence.

Zinc Finger Protein (Gb:M88357)

ESTs 

complement component 2 
Interleukin 15 receptor; alpha 
collagen-binding protein 2 (colligin 2)
plasminogen activator inhibitor; type I 
ESTs; Weakly similar to T25G3.1 [C.elegans] 
ESTs 

CD14 antigen
ESTs

ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens]
ESTs 

eukaryotic translation initiation factor 4E binding protein 1 
ESTs 

ESTs; Weakly similar to F55A12.9 [C.elegans]
mannosidase; alpha; class 2B; member 1

ESTs; Wkly smlr to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]
ESTs 

collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal 

dysplasia; congenital) 

ESTs 

ESTs 

ESTs 

ESTs; Wkly smlr to !!!! ALU SUBFAMILY SX WARNING ENTRY !!!! [H.sapiens]
Accession not listed in Genbank
dopamine receptor D4
EST

Homo sapiens clone 24940 mRNA sequence
small nuclear ribonucleoprotein polypeptide C
Mucin 1, Epithelial, Alt. Splice 9
ESTs 

ribosomal protein L18a 
ESTs 

Homo sapiens clone 24432 mRNA sequence 

ESTs; Weakly similar to WASP-family protein [H.sapiens]

Human L2-9 transcript of unrearranged immunoglobulin V(H)5 pseudogene

calnexin

ESTs

EST

ceruloplasmin (ferroxidase)

ESTs; Highly similar to KIAA0476 protein [H.sapiens]

ESTs 

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp564C188 (from clone DKFZp564C186)

ESTs; Weakly similar to hypothetical protein [H.sapiens]

ESTs

otoferlin

synuclein; gamma (breast cancer-specific protein 1)

EST 

ESTs 

KIAA0346 protein 
ESTs 

Mucin (Gb:M22406) 
ESTs 

KIAA0828 protein 

Human alpha-1 collagen type II gene, exons 1, 2 and 3
ESTs 

DKFZP434B103 protein 
KIAA0211 gene product 
ESTs 



0.069 
0.069 
0.069 
0.069 
0.069 



0.069 
0.069 
0.069 



0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.07 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.071 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.072 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 

0.073 
107853 AA010210 Hs.47041
104798 AA029462 Hs.17235
134082 L16991 Hs.79006
119180 R80413 Hs.92520
107741 AA016982 Hs.64341
133683 AA335223 Hs.75558
111694 R22035 Hs.23331
120764 AA338729 Hs.133098
119389 T88826 Hs.90973
100929 HG688-HT688
119388 T88798
133019 AF009674 Hs.184434
105185 AA191495 Hs.169937
133413 S72043 Hs.73133
101017 J04599 Hs.821
132865 K02765 Hs.251972
110882 N36001 Hs.17348
129197 T90303 Hs.109308
101184 L19871 Hs.480
134910 AA431320 Hs.9100
119411 T96621 Hs.203656
102000 U01824 Hs.380
114691 AA121893 Hs.103779
134179 U53204 Hs.79706
134503 U34880 Hs.84183
129719 N66396 Hs.167766
113916 W80464 Hs.31928
113897 W73926 Hs.4947
129697 R00841 Hs.172069
112078 R44155 Hs.112218
121980 AA429886 Hs.110407
100898 HG4638-HT5050
121626 AA416974 Hs.98174
133870 AA243416 Hs.75470
131879 AA017161 Hs.33792
100254 D38037 Hs.77643
133194 AA291726 Hs.67201
106081 AA418394 Hs.25354
115544 AA351433 Hs.66187
119955 W87460 Hs.58989
104407 H61361 Hs.102171
135019 X58431 Hs£8428
114815 AA161488 Hs.103931
119471 W31352 Hs.55445
117788 N48292 Hs.46849
119406 T95064 Hs.193771
130777 R61742 Hs.256554
130494 L13197 Hs.75874
104107 AA424111 Hs.12598
121483 AA411981 Hs£5274
104451 M13299 Hs.102119
118027 N52770 Hs.75968
109419 AA227560 Hs.88987
115783 AA424487 Hs.72289
110585 H62223 Hs.133526
123165 AA488863 Hs.105216
103966 AA303166 Hs.127270
109549 F01528 Hs£1192
106730 AA465520 Hs.22313
120310 AA193676 Hs.118926
104078 AA402801 Hs.222010
117624 N35978 Hs.82364
112421 R62441 Hs.23127
106958 AA497026 Hs.22059
129984 W92811 Hs.183927
122044 AA431456 Hs.98736
123280 AA491285 Hs.175144
115710 AA412535 Hs.55235



ESTs 0.073 

ESTs 0.073 

deoxythymidylate kinase 0.073

ESTs 0.073 

ESTs 0.073 

pepsinogen 5; group I (pepsinogen A) 0.073 

ESTs 0.073 

ESTs 0.073 

ESTs 0.074 

Major Histocompatibility Complex, Class II, Dr Beta 2 (Gb:X65561) 0.074 

plasminogen activator inhibitor; type I 0.074 

axin 0.074 

ESTs 0.074 

metallothionein 3 (growth inhibitory factor (neurotrophic)) 0.074

biglycan 0.074

complement component 3 0.074

ESTs; Wkly smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 0.074

ESTs; Wkly smlr to leucine-rich glioma-inactivated prot precursor [H.sapiens] 0.074

activating transcription factor 3 0.075 

ESTs 0.075 

EST 0.075 

solute carrier family 1 (glial high affinity glutamate transporter); member 2 0.075 

ESTs; Weakly similar to envelope protein [H.sapiens] 0.075

plectin 1; intermediate filament binding protein; 500kD 0.075
diptheria toxin resistance protein required for diphthamide biosynthesis (Saccharomyces)-like 1 0.075

ESTs; Moderately similar to Pro-a2(XI) [H.sapiens] 0.075

ESTs; Wkly smlr to alternatively spliced product using exon 13A [H.sapiens] 0.075

ESTs 0.075 

DKFZP434C212 protein 0.075 

ESTs 0.075 
ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans] 0.075

Spliceosomal Protein Sap 49 0.075 

ESTs 0.075 

hypothetical protein; expressed in osteoblast 0.075 

ESTs 0.075 

FK506-binding protein 1B (12.6 kD) 0.075

ESTs 0.075 

ESTs 0.075 

Homo sapiens clone 23700 mRNA sequence 0.076 

ESTs 0.076 

immunoglobulin superfamily containing leucine-rich repeat 0.076

Human Hox2.2 gene for a homeobox protein 0.076 

DKFZP434B0335 protein 0.076 

ESTs 0.076 

ESTs 0.076 

EST 0.076 

ESTs 0.076 

pregnancy-associated plasma protein A 0.076 

T-cell lymphoma invasion and metastasis 2 0.076

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens] 0.076 

blue cone pigment 0.076 

thymosin; beta 4; X chromosome 0.076 

receptor-interacting serine-threonine kinase 3 0.076

ESTs; Weakly similar to LIV-1 protein [H.sapiens] 0.076

ESTs; Wkly smlr to !!!! ALU SUBFAMILY SB1 WARNING ENTRY !!!! [H.sapiens] 0.076

ESTs; Weakly smlr to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 0.077

ESTs 0.077 

Homo sapiens clone 25155 mRNA sequence 0.077 

ESTs 0.077 

DKFZP586K0919 protein 0.077 

ESTs 0.077 

ESTs 0.077 

ESTs 0.077

ESTs 0.077
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 0.077

EST 0.077 

ESTs 0.077 
sphingomyelin phosphodiesterase 2; neutral membrane (neutral sphingomyelinase) 0.077

134129 D87444 Hs.79305
129321 AA224502 Hs.206501
130513 AA460257 Hs.15866
100996 J03909 Hs.14623
128358 AI095718 Hs.135015
128544 R59352 Hs.119273
106040 AA412681 Hs.125139
106495 AA452113 Hs.32454
131833 R40899 Hs.32973
119219 R97176 Hs.110783
135415 X60855 Hs.99967
109457 AA232646 Hs.68061
117137 H96670 Hs.42221
107094 AA609814 Hs£241
130165 T90529 Hs.251613
124072 H05252 Hs.101637
126151 AA324743 Hs.40808
119035 R01779 Hs.7740
110157 H18987 Hs.169731
128515 AA149044 Hs.10086
133069 US4836 Hs.6430
112209 R49644 Hs£4865
133361 R28279 Hs.71848
134714 U89922 Hs.890
129905 T86796 Hs.132875
120421 AA236166 Hs.132957
100885 HG4490-HT4876
102789 U86759 Hs.158336
120139 Z39273 Hs.77876
135238 U76343 Hs.96970
129618 N54845 Hs.173030
132960 AA609742 Hs.6150
108751 AA127063 Hs.203717
134060 D42039 Hs.78871
111338 N79778 Hs.35094
112345 R58880 Hs.26563
126456 W00881
128937 Z39939 Hs.10726
103485 Y08409 Hs.248415
111202 N68280 Hs.107922
132625 AA429890 Hs.166066
103434 X98085 Hs.54433
102616 U65581 Hs.159191
102667 U70867 Hs.83974
111422 R01127 Hs.19104
101411 M16938 Hs.820
113267 T65058 Hs.12725
103559 Z19585 Hs.75774
131588 AA258613 Hs.29189
107821 AA020991 Hs.172856
134278 H82839 Hs.81001
120893 AA369800 Hs.97058
108786 AA128999
106890 AA489245 Hs.88500
119760 W72267 Hs.58219
132999 Y00787 Hs.624
129156 AA028195 Hs.108973
121171 AA400008 Hs.161814
103864 AA207264 Hs.181077
128591 AA255537 Hs.102057
122172 AA435753 Hs.161854
112802 R97647 Hs.174855
107723 AA015967 Hs.60680
113011 T23737 Hs.1600
131279 AA089853 Hs.25197
103190 X70083 Hs.58414

KIAA0255 gene product 0.077 

Homo sapiens clone 643 unknown mRNA; complete sequence 0.078 

ESTs 0.078 

interferon; gamma-inducible protein 30 0.078

ESTs 0.078 

KIAA0296 gene product 0.078 

ESTs 0.078 

ESTs; Moderately similar to KIAA0544 protein [H.sapiens] 0.078

glycine receptor; beta 0.078 

ESTs 0.078 

even-skipped homeo box 1 (homolog of Drosophila) 0.078

ESTs; Weakly similar to sphingosine kinase [M.musculus] 0.078

ESTs 0.078 

ESTs 0.078 

EST 0.078 

EST; Weakly similar to hypothetical protein [H.sapiens] 0.078

ESTs 0.078

ESTs 0.078

ESTs 0.078

ESTs; Highly similar to HYPOTHETICAL PROTEIN KIAA0195 [H.sapiens] 0.078

protein with polyglutamine repeat 0.078

ESTs 0.078

Human clone 23548 mRNA sequence 0.078

lymphotoxin beta (TNF superfamily; member 3) 0.078

ESTs; Weakly similar to predicted using Genefinder [C.elegans] 0.079

ESTs; Weakly similar to chondromodulin-I precursor [H.sapiens] 0.079

Proline-Rich Protein Prb4, Allele 0.079

netrin2(chicken)-like 4 0.079 
Human DNA from chromosome 19-specific cosmid R30923; genomic sequence 0.079

Human liver GABA transport protein mRNA; 3' end 0.079 

ESTs 0.079 

KIAA0521 protein 0.079 

ESTs 0.079 

KIAA0081 protein 0.079 

extracellular matrix protein 2; female organ and adipocyte specific 0.079 

ESTs 0.079 
za56d02.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:296547 5', mRNA sequence. 0.079

ESTs 0.079

thyroid hormone responsive SPOT14 (rat) homolog 0.079

ESTs 0.079

cisplatin resistance associated 0.079

tenascin R (restrictin; janusin) 0.079

ribosomal protein L3-like 0.079

solute carrier family 21 (prostaglandin transporter); member 2 0.079 

ESTs 0.079 

homeo box C8 0.08 
ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 0.08

thrombospondin 4 0.08

KIAA1021 protein 0.08 

ESTs 0.08 

ESTs; Weakly similar to DY3.6 [C.elegans] 0.08

EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens] 0.08
zo8f12.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone IMAGE:567119 3', mRNA sequence 0.08

KIAA1066 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse) 0.08

ESTs 0.08 

InterleukinS 0.08 

dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit 0.08

ESTs 0.08 

ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens] 0.08 

ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens] 0.08

EST 0.08 

EST 0.08 

EST 0.08 

chaperonin containing TCP1; subunit 5 (epsilon) 0.081

STIP1 homology and U-Box containing protein 1 0.081

filamin C; gamma (actin-binding protein-280) 0.081
103956


AA292411 


Hs.233348 


112706 


R89828 


Hs.138493 


126126 


M85370 




130094 


H43266 


Hs.167017 


100800 


HG3945-HT4215 




108675 


AA115240


Hs.61616 


129420 


AA234259 


Hs.99816 


129666 


M77349 


Hs.118787


101646


M59807


Hs.943 


130536 


T17045 


Hs.159492 


107732 


AA016181 


Hs.59752


123071 


AA482593 


Hs.104265 


113537 


T90457 


Hs.191293 


101250 


L34060 


Hs.79133 


122521 


AA449433 


Hs.149227 


133914 


N32811 


Hs.77542 


102038 


U05659 


Hs.477 


110336 


H40338 


Hs.174094 


118637 


N70274 


Hs.49822 


117966 


N51589 


Hs.94012 


104424 


H87671 


Hs.182320


100361


D78361


Hs.125078


112974 


T17291 


Hs.101174 


132832 


D63482


Hs.57734 


132039 


Z39489 


Hs.3781 


113272 


T65383 


Hs.12807 


104924 


AA058532 


Hs.28774 


111061 


N58054 


Hs.38859 


129269 


R45977 


Hs.163593 


102453 


U48437 


Hs.74565 


126204 


AI080388


Hs.134296 


116615 


D80666 


Hs.45203 


128856 


AA219552 


Hs.204144 


112776 


R95850 


Hs.34494 


105494 


AA256273 


Hs.29288 


1 1/UUU 


UQA71Q 

nwno 


ns.t i^ioo 


112656 


R85260 


Hs.133151 


128963 


J03890 


Hs.1074 


116957 


H79292 


Hs.39960 


1Q1U57 


ism/in 




121948 


AA429452 


Hs.98582 


130822


M80647 


Hs.2001 


122743 


AA458674 


Hs.99478 


114569 


AA063316 




132270 


U70671 


Hs.43509 


108126 


AA052951 


Hs.47413 


102880 


X04325 


Hs.2679 


115365 


AA282089 


Hs.88599 


114529 


AA052980 


Hs.206704 


135017 


AA249586 


Hs.9315 


123776 


AA610071 


Hs.112813


114454 


AA021091 


Hs.226208 


101246 


L33799 


Hs.202097 


107366 


U78310 


Hs.13501 


132779 


T89601 


Hs.95497 


129709 


AA112209


Hs.1209 


115244 


AA278767 


Hs.914 


123253 


AA490878 


Hs.111334 


128469 


T23724 


Hs.258677 


132220 


AA431847 


Hs.42409 


111664 


R17939 


Hs.22344 


102354 


U38268 




112828 


R98774 


Hs.194338



ESTs 0.081 
ESTs 0.081 
EST01884 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA
clone HFBCH10, mRNA sequence,
gamma-aminobutyric acid (GABA) B receptor; 1
Phospholipid Transfer Protein
ESTs
ESTs

transforming growth factor; beta-induced; 68kD
natural killer cell transcript 4
spastic ataxia of Charlevoix-Saguenay (sacsin)
ESTs
ESTs
ESTs
cadherin 8

ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus]
ESTs 

hydroxysteroid (17-beta) dehydrogenase 3 

ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens]
ESTs
ESTs

ESTs; Weakly similar to Mouse 19.5 mRNA; complete cds [M.musculus]
Human mRNA for ornithine decarboxylase antizyme; ORF 1 and ORF 2
microtubule-associated protein tau
KIAA0148 gene product

Homo sapiens BAC clone RG118D07 from 7q31
ESTs 
ESTs 
ESTs 

ribosomal protein L18a 
amyloid beta (A4) precursor-like protein 1 
ESTs 
ESTs 

ESTs; Modly smlr to tumor necrosis factor-alpha-induced prot B12 [H.sapiens]
ESTs

Homo sapiens mRNA; cDNA DKFZp434P174 (from clone DKFZp434P174)
ESTs; Weakly similar to repressor protein [H.sapiens]
transient receptor potential channel 7 
surfactant; pulmonary-associated protein C 
ESTs 

Human complement C1q B-chain gene, exon A+1
ESTs 

thromboxane A synthase 1 (platelet; cytochrome P450; subfamily V) 
EST 

zm2d1.s1 Stratagene corneal stroma (#937222) Homo sapiens cDNA clone
IMAGE3129473 1 similar to TR:E198281 E198281 THIOREDOXIN
REDUCTASE; contains Alu repetitive element;, mRNA sequence
ataxin 2 related protein
ESTs

gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth
neuropathy; X-linked)
ESTs
ESTs

ESTs; Weakly similar to NEURONAL OLFACTOMEDIN-RELATED
ER LOCALIZED PROTEIN [H.sapiens]
ESTs 
ESTs 

procollagen C-endopeptidase enhancer 
pescadillo (zebrafish) homolog 1; containing BRCT domain
ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5;
SMALL INTESTINE [H.sapiens] 0.083
acyl-Coenzyme A dehydrogenase; long chain 0.083
Human mRNA for SB class II histocompatibility antigen alpha-chain 0.083
ferritin; light polypeptide 0.083 
EST 0.083 
ESTs; Highly similar to CGI-146 protein [H.sapiens] 0.083
ESTs 0.083 
Human cytochrome b pseudogene, partial cds 0.084 
ESTs 0.084 



0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.081 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.082 
0.083 
0.083 
0.083 
0.083 
0.083 
0.083 



0.083 
0.083 



0.083 
0.083 
0.083 

0.083 
0.083 
0.083 
0.083 
0.083 
110410


H47868 


Hs.34024 


102620 


U66052 




102550 


U58087 


Hs.14541 


108417 


AA075716 




113299 


T67285 


Hs.13089 


117869 


N49947 


Hs.46990 


113734 


T98484 


Hs.18377 


133325 


C00424 


Hs.7101 


123368 


AA505022 


Hs.124838 


101615 


M55153 


Hs.8265 


119352 


T65972 


Hs.193365 


123828 


AA620686 


Hs.112884


103611 


Z38133 


Hs.113973 


131289 


AA485697 


Hs.25334 


128678 


T1589S 


Hs.103535 


130814 


AA256695 


Hs.19813 


133391 


X57579 


Hs.727 


129322 


AA437153 


Hs.110407


109284 


AA196995 


Hs.86092 


116689 


F09222 


Hs.66099 


100545 


HG2147-HT2217 




102634 


U66711 


Hs.77667 


111735 


R25389 


Hs.23856 


105181 


AA190875 


Hs.10974 


122681 


AA455350 


Hs.99401 


114543 


AA056121 


Hs.158419 


133597 


AA425903 


Hs.75139 


121064 


AA398647 


Hs.97406 


122231 


AA436369 


Hs.197728 


100309 


D50550 


Hs.95659 


101727 


M73481 


Hs.73883 


131226 


AA165400 


Hs.24476 


133580 


AA095041 


Hs.181073 


102792 


U87964 


Hs.227576 


104976 


AA086480 


Hs.183669 


120865 


AA350631 


Hs.96963 


106080 


AM18046 


Hs.35124 


128571 


M416619 


Hs.101681 


101838 


M92934 


Hs.75511


128514 


H84261 


Hs.100843 


123099 


AA485931 


Hs79 




TUO£\A/ 


Hs 78920 

I 19. I wvfcv 


116967 


H80336 


Hs.40124 


110053 


H12586 


Hs.89563 


114395 


AA007313 


Hs.110155 


107465 


W44681 


Hs.251385 


101983 


S85655 


Hs.75323 


112544 


R70948 


Hs.29153 


111423 


R01165 


Hs.188507


127918 


AA806043 


Hs.115396


107300 


T40348 


Hs.90488


134947 


R51194 




124579 


N68345 


Hs.127179


130471 


Z68280 


Hs.183706 


116596 


D60755 


Hs.92955


105069 


AA136345 


Hs.23817 


102491 


U51010 




130069 


AA055896 


Hs.146428 


130234 


AA280413 


Hs.157441


120540 


AA262992 


Hs.96417 


122508 


AA449221 


HS20432 



ESTs 

Human clone W2-6 mRNA from chromosome X 
cullin 1

zm89e5.s1 Stratagene ovarian cancer (#937219) H sapiens cDNA clone 

IMAGE54512 3' similar to gbX14723 CLUSTERIN PRECURSOR

(HUMAN);, mRNA sequence 

ESTs 

ESTs 

EST 



ESTs 

transglutaminase 2 (C polypeptide; protein-glutamine

-gamma-glutamyltransferase)

ESTs; Moderately similar to alternatively spliced product 

using exon 13A [H.sapiens]

EST 

myosin; heavy polypeptide 8; skeletal muscle; perinatal 

ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC

PRECURSOR [M.musculus]

ESTs 

ESTs 

inhibin; beta A (activin A; activin AB alpha polypeptide)

ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12.5 [C.elegans]

ESTs 

ESTs 

Mucin 3, Intestinal (Gb:M55405) 

lymphocyte antigen 6 complex; locus E 

ESTs; Weakly similar to FAST kinase [H.sapiens]

ESTs; Moderately similar to unknown [R.norvegicus]

EST 

ESTs 

partner of RAC1 (arfaptin 2) 
ESTs 

ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens]

lethal giant larvae (Drosophila) homolog 1

gastrin-releasing peptide receptor

ESTs 

ESTs 

GTP binding protein 1 

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

EST 

ESTs 

ESTs 

connective tissue growth factor 

ESTs; Weakly similar to similar to GTP-binding protein [C.elegans]
aminoacylase 1

Rab geranylgeranyltransferase; alpha subunit
EST 

nuclear cap binding protein 1 ; 80kD 
ESTs 

murine retrovirus integration site 1 homolog

prohibitin 

ESTs 

ESTs 

Human germline IgD chain gene; C-region; C-delta-1 domain
ESTs 

yj71a08.r1 Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE:154166

5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN

KINASE KINASE 1 (HUMAN);, mRNA sequence.

ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH

FACTOR 1 [H.sapiens]

adducin 1 (alpha) 

ESTs 

ESTs; Weakly similar to ZFOC1 gene product [H.sapiens] 

Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region

collagen; type V; alpha 1

spleen focus forming virus (SFFV) proviral integration oncogene spi1

ESTs 

ESTs 



0.084 
0.084 
0.084 



0.084 
0.084 
0.084 
0.084 
0.084 
0.084 

0.084 

0.084 
0.084 
0.084 

0.084 
0.084 
0.084 
0.084 
0.084 
0.084 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085
0.065 
0.085 
0.085 
0.085 
0.086 
0.086 
0.086 



0.088 

0.088 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
128054
133020 
130056 
130504 
133978 
105265 
133035 
100768 
129338 
132789 
116099 
100721 
112569 
130645 
100751 
134550 
130885 
101446 
116287 
134034 
130860 
109901 
107537 

133232 
108559 

121288 
108844 
129874 
105139 
124789 
115923 
123640 
131607 
130064 
108752 
124249 
100109 
104642 
131752 
114727 
120965 
100396 
106218 
111562 
121219 
101187 
101513 
116454 
116171 
117500 
119978 
132005 
109914 
130370 
104262 
129703 
106398 
120884 
130404 
114072 
131470 
124573 
114717 
133806 
130470 
133182 
116036 



AI205718 

AA053248 

AA017356 

U48865 

W73859 

AA227941 

T15965 

HG3636-HT3848 

T56800 

W23761 

AA456309 

HG3355-HT3532 

R73150 

AA020942 

HG3527-HT3721 

M27161 

AA338646 

M21302 

AA487856 

X89267 

U66061 

H04992 

Z20777 

AA496030 
AA085161 

AA401735 

AA132916 

AA406486 

AA164543 

R43803 

AA441929 

AA609292 

AA351409 

T67053 

AA127070 

H68077 

AJ000480 

AA004662 

AA453311 

AA132545 



084361 

AA428451 

R09567 

AA400606 

120316 

M28210 

AA621071 

AA463434 

N31909 

W88623 

D58231 

H05529 

M55265 

AF009801 

AA417181 

AA447545 

AA3S5356 

X72012 

Z38184 

X54938 

N67935 

AA131240 

M12759 

AA398552 

280787 

AA452572 



Hs.125416 

Hs.185182 

Hs.171900 

Hs.158323 

Hs.78061 

Hs£6088 

Hs.6333 

Hs.47274 
Hs.66876 
Hs.58831 

Hs.75270 
Hs.17200

Hs.85258 

H&20912 

Hs.56306 

Hs.155829 

Hs.78601 

Hs241395 

H&30499 

Hs.9857 

Hs.6845 



Hs.97340 

Hs.177961 

Hs.181551 

Hs.110082

Hs.78110 

Hs.38205 

Hs.112681

Hs.172740 

Hs.181125 

Hs.71055 

Hs.108211

Hs.143513

Hs.184245

HSJ31566 

HS.190202 

Hs.179715 

Hs.151123 

H&91146 

Hs.187569 

Hs.144344 

Hs£08 

H&27744 

Hs.42034 

Hs.42658 

H&44278 

H&59190 

Hs.173091 

Hs.194704 

Hs.155140 

Hs.105941

Hs.120858 

Hs.18268 

Hs.97041 

Hs.76753 

H 5,1 23633 

Hs.2722 

Hs.194703 

H&252014 

HS.76325 

Hs.15711 

Hs.240135 

Hs.43866 



ESTs 

ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens]



CCAAT/enhancer binding protein (C/EBP); epsilon
transcription factor 21
ESTs 
ESTs 

Myosin, Heavy Polypeptide 9, Non-Muscle 

Homo sapiens mRNA; cDNA DKFZp564B176 (from clone DKFZp564B176)
ESTs

regulator of Fas-induced apoptosis

Peroxisome Proliferator Activated Receptor (Gb230972)

GTP-binding protein homologous to Saccharomyces cerevisiae SEC4

STAM-like protein containing SH3 and ITAM domains 2

Luteinizing Hormone, Beta Subunit

CD8 antigen; alpha polypeptide (p32)



small proline-rich protein 2A
KIAA0676 protein 
uroporphyrinogen decarboxylase 
protease; serine; 1 (trypsin 1) 
ESTs 

ESTs; Weakly similar to peroxisomal short-chain alcohol

dehydrogenase [H.sapiens]

ESTs 

zn12c5.s1 Stratagene hNT neuron (#937233) H sapiens cDNA clone 
IMAGE54728 3' similar to TR:G1 151228 G1151228 LPG1P. ;, mRNA seq 
EST 

Human Chromosome 16 BAC clone CIT987SK-A-388D4

ESTs 

ESTs 

ESTs; Weakly similar to F17A9.2 [C.elegans]

ESTs 

ESTs 

microtubule-associated protein; RP/EB family; member 3 

immunoglobulin lambda gene cluster

ESTs 

ESTs 

phosphoprotein regulated by mitogenic pathways

KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog

ESTs 

ESTs 

ESTs 

Human mRNA for p52 and p64 isoforms of N-Shc; complete cds

DKFZP586E0820 protein 

ESTs 

EST 

glucagon receptor 

RAB3A; member RAS oncogene family 

ESTs; Moderately similar to T-complex protein 10A [H.sapiens]

ESTs 

ESTs 

EST 

DKFZP434K151 protein
leucine-rich; glioma inactivated 1
casein kinase 2; alpha 1 polypeptide
bagpipe homeobox (Drosophila) homolog 1
ESTs 

adenylate kinase 5 
ESTs 

endoglin (Osler-Rendu-Weber syndrome 1) 
ESTs 

inositol 1;4;5-trisphosphate 3-kinase A 
adaptor-related protein complex 4; mu 1 subunit 
EST 

Human Ig J chain gene

KIAA0639 protein

H4 histone family; member J

ESTs 



0.086 
0.086 
0.088 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.087 
0.087 
0.087 
0.087 
0.087 
0.087 
0.087 
0.087 
0.087 
0.087
0.087 

0.087 
0.087 

0.087 
0.087 
0.087 
0.087 
0.088 
0.088 



0.088
0.088
0.088
0.088 
0.088 



0.088 
0.088 
0.088 



0.088 
0.088 
0.068 
0.088 
0.088 
0.088 
0.089 



0.089 

0.089 

0.089 

0.089 

0.089 

0.089 

0.089 

0.089 

0.039 

0.089 

0.089 

0.089 

0.089 

0.09 

0.09 

0.09 

0.09 
132404
122695 
125975 
110783 
129860 
120740 
119564 
134474 
119014 
109791 
117605 
121589 
104328 
129861 
102795 
119626 
110516 
105382 
123754 
108008 
121057 
123675 
135194 
127070 
134051 
133382 
103615 
118457 
118504 
112915 
132088 
101504 
112550 
128551 
112879 
127079 
101993 
113020 
120465 
130152 
104941 
110090 
135375 
123799 
118966
116969 
125147 
100836 
114726 
107311 
112863 
129290 
103384 

112508 
111863 
131184 
107420 
111768 
112290 
130581 
120744 
112226 
116154 
102640 
129797 
102705 
132408 
108441 
AA393903

AA456048 

AA495891 

N23669 

AA410343 

AA302650 

W38206 

AA054746 

N95435 

F10669 

N35073 

AM16627 

D81655 






U88667 

W49499 

H53894 

AA236853 

AA609964 

AA039430 

AA398619 

AA609474 

C20975 

AA641812 

S67070 

AA1 12532 

Z46967 

N66593 

N67334 

T10176 

AA470121 

M27288 

R71391 

H09058 

T03541 

AJ 364691 

U01062 

T23830 

AA251505 

U32645 

AA065169 

H16076 

AA480888 

AA620418 

N93438 

K80633 

W38150 

HQ4113+iT4383 

AA132509 

T57738 

T03148 

AA521407 

X92762 

R68213 

R37495 

AA452705 

W26567 

R27606 

R53940 

AA481982 

AA302772 

R50761 

AA460951 

U67674 

X53595 

U77180 

AA035547 

AA079079 



Hs,4768 

Hs.89403 

Hs.152290 

Hs.26407 

Hs.129826 

Hs.96654 

Hs£379 

Hs£5144 

Ms.13228 

Hs.44433 

Hs.191598 

Hs.143067 

Hs.129849 

Hs.198396 

Hs.134456 

Hs37368 

Hs.111801

Hs.102021 

Hs.61920 

Hs.142375

Hs.112713

Hs.9613

Hs.190037

Hs.78846 

Hs.7247 

Hs.115460

Hs.49230 

Hs.50153 

Hs.4254 

Hs.243960 

Hs.248156 

Hs.29074 

Hs.237323 

Hs.115960

HS.128628 

Hs.77515 

Hs.7303 

Hs.130851 

Hs.151139 

HS.17805 

Hs.6915 

Hs.99741 

Hs.112861

Hs.76907 

Hs.143038 



Hs.103827 
Hs.174112 
HS.4610 
HS.1 10095 
Hs.79021 

HS28847 

HS-23573 

Hs.23954 

Hs.4775 

Hs.24185 

Hs.26016 

Hs.16258 

Hs.228649 

Hs.25738 

Hs57100 

Hs.194783 

Hs.1252 

Hs.50002 

Hs.47822 



ESTs 0.09 

ESTs; Moderately similar to undulin 2 [H.sapiens] 0.09

ESTs; Highly similar to PACAP type-3/VIP type-2 receptor [H.sapiens] 0.09 

ESTs 0.09 

tetraspan transmembrane 4 super family 0.09 

EST 0.09 

Accession not listed in Genbank 0.09 

ESTs 0.09 

ESTs 0.09 

DRE-antagonist modulator; calsenilin 0.09

ESTs 0.09 

ESTs 0.09 

ESTs 0.09 

DKFZP564M182 protein 0.09 

ATP-binding cassette; sub-family A (ABC1); member 4 0.09

ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 0.09

EST 0.09 
Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023) 0.09 

ESTs 0.09 

ESTs 0.09 

ESTs; Moderately similar to putative envelope protein [H.sapiens] 0.091

EST 0.091

ESTs; Highly similar to angiopoietin-related protein [H.sapiens] 0.091

ESTs 0.091 

heat shock 27kD protein 2 0.091

ESTs 0.091

calicin 0.091

EST 0.091 

ESTs 0.091 

ESTs 0.091 

HLA-B associated transcript-3 0.091

oncostatin M 0.091

ESTs 0.091

N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 0.091

ESTs 0.091

ESTs; Moderately similar to CL3BC [R.norvegicus] 0.091

inositol 1;4;5-triphosphate receptor; type 3 0.091

ESTs; Weakly similar to PROHIBITIN [H.sapiens] 0.091

ESTs 0.091 

E74-like factor 4 (ets domain transcription factor) 0.091 

ESTs 0.091 

ESTs 0.091 

ESTs; Weakly similar to BRAIN PROTEIN H5 [H.sapiens] 0.091 

ESTs 0.092 

ESTs; Highly similar to HSPC002 [H.sapiens] 0.092

ESTs 0.092 

Accession not listed in Genbank 0.092 

Olfactory Receptor Or17-201 0.092 

EST 0.092 

ESTs 0.092 

EST 0.092 

ESTs 0.092 
tafazzin (cardiomyopathy; dilated 3A (X-linked); endocardial 

fibroelastosis 2; Barth syndrome) 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to KIAA0584 protein [H.sapiens] 0.092

ESTs 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H.sapiens] 0.092

EST 0.093 

ESTs 0.093 

ESTs 0.093 

solute carrier family 10 (sodium/bile acid cotransporter family); member 2 0.093 

apolipoprotein H (beta-2-glycoprotein I) 0.093

small inducible cytokine subfamily A (Cys-Cys); member 19 0.093

KIAA0380 gene product; RhoA-specific guanine nucleotide exchange factor 0.093
zm97c9.s1 Stratagene colon HT29 (#937221) Homo sapiens cDNA clone
117824
115771 
102303 
131405 
112909 
124173 
112488 
130554 
106413 
111711 
117595 
113813 
107769 

114968 

130297 
109589 
112592 
102314 
116128 
106809 



130607 
120592 
117230
105948 
101333 
101909 



127034 
134430 
120342 
104450 
130902 
102708 
107373 
123569 
102687 
128888 
100283 
102747 
107798 
123565 
116010 
117155 
133094 
113174 
102016 
130126 
134813 
132055 
122229 
127574 
134432 
128052 
101637 
103386 
133079 
120328 



AA054133 
AA449990 
M64358 
AA401958 

N49065 

AA422049 

U33053 

U79255 

T10069 

H41281 

R66896 

X59303 

AA447964 

R22891 

N34933 

W45174 

AA018449 

AA250743 

H94949 

F02429 

R77631 

U34038 

AA459915 

AA479704 



AA043894 

AA281929 

N20535 

AM04597 

U7738 



AA497031 

AA3S2389 

H52105 

AA207105 

L77564 

AA424530 

U77594 

U85773 

AA608952 

U73379 

AA034951 

D43642 

U79303 

AA019346 

AAS08907 

AA449450 

H97536 

AA1 15572 

T54659 

U03270 

AB002318 

X14767 

N69440 

AM36198 

AA907314 

AA053022 

AA878398 

M58285 

X92972 

AM77561 

AA196979 



Hs.63085 
Hs.76057 

Hs.240170 

Hs.125201 

Hs.40780 

Hs.2499 

HSJ26468 

HS.101094 

HS.107619 

HS28788 

Hs.159637 

Hs.6311 

Hs.7093

Hs.44664 

Hs.31382 

HS.125220 

Hs.92198 

Hs.171955 

Hs.6581 

Hs.29126 

Hs.154299 

Hs.112193 

Hs.220324 



Hs.16603 

Hs.143974 

Hs.43265

Hs.7133 

Hs.80313 

Hs.8657 

HS.8309 

Hs.45068 

Hs.103978 

Hs.21061 

Hs.37682 

Hs.154695 

Hs.195292

Hs.93002 

Hs.106893 

Hs.2430 

Hs.82482 

Hs.60918 

Hs.112614

Hs.56421 

Hs.42391 

Hs.64746 

Hs.9779 

Hs.122511 

Hs.150443 

Hs.89768 

Hs.38132 

Hs.103902 

Hs.188905 

Hs.8312 

Hs.190491 

Hs.132834

Hs.80324 

Hs.6449 

Hs.104129 



IMAGE:545872 3' similar to contains element MER22 MER22 repetitive

element ;, mRNA sequence 0.093

ESTs 0.093 

lysophospholipase II 0.093

Human rhom-3 gene, exon 0.093 
ESTs; Moderately similar to alternatively spliced product using 

exon 13A [H.sapiens] 0.093

ESTs; Weakly similar to 87 [M.musculus] 0.093 

ESTs 0.093 

protein kinase C-like 1 0.093

amyloid beta (A4) precursor protein-binding; family A; member 2 (X11-like) 0.093

ESTs 0.093 

ESTs 0.093 

ESTs 0.093 

valyl-tRNA synthetase 2 0.093

ESTs 0.093 

ESTs 0.094 

EST 0.094 

ESTs 0.094 
Homo sapiens DNA from chromosome 19-cosmids R30102:R29350:R27740

containing MEF2B; genomic sequence 0.094 
ESTs; Highly similar to calcium-regulated heat stable protein 

CRHSP-24 [H.sapiens] 0.094

trophinin-assisting protein (tastin) 0.094

ESTs 0.094 

ESTs 0.094 

coagulation factor II (thrombin) receptor-like 1 0.094 

mutS (E. coli) homolog 5 0.094
Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33.
Contains the alternatively spliced gene for Matrix Metalloproteinase in the
Female Reproductive tract MIFR1; -2; MMP21/22A; -B and -C; a novel gene;

the alternatively spliced CDC2L2 gene for 0.094 

ESTs 0.094 

ESTs 0.094 

melastatin 1 0.094 

ESTs 0.094 

p53 inducible protein 0.094

Homo sapiens mRNA for PLE21 protein; complete cds 0.094

ESTs; Highly similar to CTG7a [H.sapiens] 0.094

ESTs; Wkly smlr to glucose-6-phosphatase catalytic subunit [R.norvegicus] 0.095

KIAA0747 protein 0.095 

Homo sapiens mRNA; cDNA DKFZp434H43 (from clone DKf=Zp434l143) 0.095 

serine/threonine kinase 22B (spermiogenesis associated) 0.095

ESTs 0.095

retinoic acid receptor responder (tazarotene induced) 2 0.095

phosphomannomutase 2 0.095

ESTs; Weakly similar to RNA helicase HDB/DICE1 [H.sapiens] 0.095

ubiquitin carrier protein E2-C 0.095 

ESTs 0.095 

transcription factor-like 1 0.095

protein predicted by clone 23882 0.095

EST 0.095

EST 0.095

ESTs; Weakly similar to Similarity to H.influenza ribonuclease PH [C.elegans] 0.095

EST 0.095

chloride intracellular channel 3 0.095

ESTs 0.095 

centrin; EF-hand protein; 1 0.095 

KIAA0320 protein 0.095 

gamma-aminobutyric acid (GABA) A receptor; beta 1 0.095

ESTs 0.095 

ESTs 0.096 

ESTs 0.096 

ESTs 0.096 

ESTs 0.096

hematopoietic protein 1 0.096 

protein phosphatase 6; catalytic subunit 0.098 

ESTs 0.098 

ESTs; Weakly similar to protease [H.sapiens] 0.096 
107640


AA009615 


Hs.257808 


ESTs 


123369 


AA521176 


Hs.221231 


ESTs 


103222 


X74795


Hs.77171 


minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46)


111704 


R22450 


HS23396 


ESTs; Highly similar to ZINC FINGER PROTEIN 140 [H.sapiens]


126856 


AAS06523 




EST177475 Jurkat T-cells VI Homo sapiens cDNA 5' end, mRNA sequence.


127071 


AA250806 




ESTs 


114650 


AA056755 


Hs.151714 


ESTs 


125955 


AI356943 


Hs.143761 


ESTs 


134363 


M37033 


Hs.62212 


CD53 antigen 


128550 


W76492 


Hs.170142 


ESTs 


122598 


AA453465 


Hs.99329 


ESTs 


118898 


N90703 


Hs.4236 


KIAA0478 gene product 


117661 


N39092 


Hs.44940 


ESTs 


120996 


AA398281 


Hs.143684 


ESTs 


123388 


AA521172 


Hs.134417 


ESTs 


106700 


AA463929 


HS58701 


ESTs 


112962 


T16814 


Hs.6828


ESTs 


121262 


AA401372 


Hs.97723 


ESTs 


134551 


R44839 


Hs.8526 


i-beta-1;3-N-acetylglucosaminyltransferase


112060 


R43754 


Hs.21164 


ESTs 


134678 


AA039935 


Hs.182595 


dynein; axonemal; light polypeptide 4


100855 


HG4234-HT4504 




Methylenetetrahydrofolate Reductase


132414 


N91193 


H&48145 


ESTs 


112900 


T08758 


Hs.3813 


ESTs 


115989 


AA447777 


Hs.93135 


ESTs 


103561 


Z21488 


Hs.143434 


contactin 1


131087 


AA009738 


Hs.22824 


ESTs; Weakly similar to p160 myb-binding protein [M.musculus]


120293 


AA190859 


Hs.191428 


ESTs 


111830 


R36081 


Hs.25085 


EST 


113654 


T95770 


Hs.17666 


ESTs 


132675 


AA179338 


Hs.5476 


serine proteinase inhibitor


120162 


Z40125 


Hs.91968 


ESTs 


132879 


U16282 


Hs.5881 


ELL gene (11-19 lysine-rich leukemia gene)


134211 


AA056681 


Hs.80021 


ESTs; Weakly similar to 62D9.p [D.melanogaster]


115448 


AA284845 


Hs.165051 


ESTs 


118118 


N56901 


Hs.47995 


ESTs 


107598 


AA004528 


Hs.169444


ESTs 


128933 


H01824 


Hs.760 


GATA-binding protein 2


114892 


AA235988 


Hs.86024 


ESTs 


101922 


S75168 


Hs^74 


megakaryocyte-associated tyrosine kinase 


105444 


AA252374 


Hs.19333 


ESTs; Weakly similar to ATP(GTP)-binding protein [H.sapiens]


128155 


AA926843 


Hs.143302 


ESTs 


116276 


AA485870 


Hs.44914 


ESTs 


111964 


R41227 


Hs£1860 


ESTs 


135100 


AA398926 


Hs£51108 


Homo sapiens mRNA; chromosome 1 specific transcript KIAA0493


124872 


R69251 


Hs.101506 


EST 


103084 


X59932 


Hs.77793 


c-src tyrosine kinase 


124138 


H23199 


Hs.107010 


ESTs 


130048 


R31745 


Hs£11612 


SEC24 (S. cerevisiae) related gene family; member A 


100208 


D26129 


Hs.78224 


ribonuclease; RNase A family; 1 (pancreatic)


123537 


AA608775 


Hs.112589


ESTs 


118999 


N95019 


Hs.55092 


ESTs 


119847 


W80384 


HS.9853 


ESTs 


112819 


R98618 


Hs.35984 


ESTs 


131080 


J05008 


Hs.2271 


endothelin 1


127353 


AA190853 


Hs.155360 


ESTs 


132068 


X66365 


Hs.38461 


cyclin-dependent kinase 6


105744 


AA293436 


Hs.12909 


ESTs 


133680 


M92357 


Hs.101382 


tumor necrosis factor; alpha-induced protein 2 


122899 


AA469960 


Hs.178420 


ESTs; Highly similar to WASP interacting protein [H.sapiens]


128700 


U59286 


Hs.103982 


small inducible cytokine subfamily B (Cys-X-Cys); member 11 


104393 


H46486 


HS.226499 


nesca protein 


123320 


AA496792 


Hs.139572


EST 


129169 


N31641 


Hs.109058 


ribosomal protein S6 kinase; 90kD; polypeptide 6


135093 


U51333 


Hs.159237 


hexokinase 3 (white cell) 


113269 


T65159 


Hs.85044 


ESTs 


124283 


H86783 


Hs.194136 


ESTs; Moderately similar to Zinc finger protein RIN ZF [R.norvegicus] 


114376 


GMCSF 




Accession not listed in Genbank 


100881 


HG4458-HT4727 




Immunoglobulin Heavy Chain, Vdjc Regions (Gb±23563) 



0.096 
0.096 



0.096 
0.733 
0.096 
0.096 
0.096 
0.096 
0.098 
0.096 
0.096 
0.096 
0.096 
0.096 
0.096 



0.096 
0.096 
0.098 
0.096 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.097 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.098 
0.096 
0.098 
0.098 
0.098 
0.098 
0.099 
0.099 
116572


045654 


Hs.65582 


DKFZP586C1324 protein 


0.099
0.099 


123956 


AA621747 


Hs.112847


EST 


100818 


HG4018-HT4288 




Opioid-Binding Cell Adhesion Molecule


0.099 
0.099 


132754 


W47419 


H&56007 


Human DNA from chromosome 19-specific cosmid F25965; genomic sequence


112741 


R93080 


Hs.35035 


ESTs 


0.099 


112748 


R93299 


Hs.166492 


ESTs


0.099 


130858 


S57235 


Hs.246381 


CD68 antigen 


0.099 
0.099 


124870 


R69233 


Hs.101504 


ESTs 


125304 


Z39833 


Hs.124940 


GTP-binding protein 


0.099 


121297 


AA401995 


H&97860 


ESTs 


0.099 


128602 


AA046103 


Hs.102367 


ESTs 


0.099 


124062 


H00440 


Hs.144524 


ESTs; Weakly similar to signal transducer and activator of


0.099 








transcription 2 [M.musculus] 


100547 


HQ2149-HT2219 




Mucin (Gb:M57417) 


0.099 


105652 


AA282505 


Hs.19015 


ESTs 


0.099 


133390 


AA459945 


Hs.72660 


KIAA0585 protein


0.099 


133503 


M33195 


Hs.743 


Fc fragment of IgE; high affinity I; receptor for; gamma polypeptide 


0.099 


109461 


AA232667 


Hs.58210 


ESTs 


0.099 


102068 


U09117 


HS.6Q778 


phospholipase C; delta 1


0.099 


113464 


T86931 


Hs.16295


ESTs 


0.099 


104240 


AB002368 


Hs.70500 


KIAA0370 protein 


0.099 


121113 


AA399109 


Hs.161813 


ESTs 


0.1 


122896 


AA469952 


Hs.97899 


ESTs; Weakly similar to dal2; len:343; CAI: 0.17; ALLC_YEAST P25335
ALLANTOICASE [S.cerevisiae]


0.1


102405 


U43148 


Hs.159526 


patched (Drosophila) homolog


0.1 


103599 


Z33905 


Hs*1218 


receptor-associated protein of the synapse; 43kD


0.1


121079 


AA398719 


Hs.14169 


ESTs; Weakly similar to CREB-binding protein [H.sapiens]


0.1 


115820 


AA427487 


Hs.39619 


ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens]


0.781 


125106 


T95766 


Hs.189760 


ESTs 


0.1 


131373 


N68116 


Hs.26146 


Down syndrome critical region gene 3 


0.1 


120224 


Z41239 


Hs.106960 


ESTs 


0.1 


133090 


AA448228 


Hs.6468 


ESTs 


0.1 


132300 


AA133244 


Hs.44234 


ESTs 


0.1 


113129 


T49384 


Hs.8988 


EST 


0.1 


110638 


H73197 


HS.17241 


ESTs 


0.1 


131364 


R53255 


HS26010 


ESTs 


0.1 


105370 


AA236476 


Hs.22791 


ESTs; Weakly similar to transmembrane protein with EGF-like and two


0238 








follistatin-like domains 1 [H.sapiens]
TABLE 11A shows the accession numbers for those primekeys lacking UnigeneID's in
Table 11. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
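
The three columns defined above can be read as a lookup from probeset to gene cluster to member sequences. The following is a minimal sketch of that structure, not part of the original disclosure. The names ClusterRow, parse_row and index_by_pkey are hypothetical, and the sample row pairs Pkey 108559 (CAT number 41469.9) with accessions AA085228 and AA085161 on the assumption that rows and accession groups in Table 11A appear in the same order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ClusterRow:
    """One row of a Pkey / CAT number / Accession table (hypothetical type)."""
    pkey: int                    # unique Eos probeset identifier number
    cat_number: str              # gene cluster number, kept as text (e.g. "41469.9")
    accessions: Tuple[str, ...]  # Genbank accession numbers in the cluster


def parse_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and at least one accession: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], tuple(fields[2:]))


def index_by_pkey(lines: Iterable[str]) -> Dict[int, ClusterRow]:
    """Build a Pkey -> row lookup; a later duplicate Pkey overwrites an earlier one."""
    return {row.pkey: row for row in map(parse_row, lines)}


# Sample row; the Pkey-to-accession pairing is an assumption (see lead-in).
table = index_by_pkey(["108559 41469.9 AA085228 AA085161"])
print(table[108559].cat_number, table[108559].accessions)
```

The CAT number is stored as a string because it is an identifier, not a measurement, so no numeric rounding is applied to it.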



Pkey CAT number 
100610 19864 J 



100674 21517.2 



108559 41469.9 

100721 19818J 

100748 41861.1 

100750 15759J 



100751 24700J 



100760 1334.7 
100775 18179.3



Accession 

AW161357 A1878062 AI928938 AW1 61097 AW161 167 BE314465 AA351715 F0709& AA179034 F08510 F00653 AI936671 
AA476718 AW772454 AIB07703 R44253 AA976667 AI985186 AI650254 H38942 R84829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355 AW950556 D51397 AA213981 BE548002 AI056359 AA001560 AW952113 
AA317769 AI857477 AI857475 AW249771 AW162661 H38943 AA018628 R85885 AI984613 AI934765 AI796172 AW157488 
A1929191 R85523 D51221 D53851 H85610 AI749674 F21582 AA323145 AA019127 AA687444 T06745 AI699293 H29532 
AA214029 AA223656 NM.016834 X14474 R19897 H09695 R17455 R13812 R19056 A1681231 AI590200 R37671 AA861B28 
AI990023 AI935669 AW005821 AA324581 H17335 R37659 R42802 R46242 R60936 R59731 H28993 AM79907 R44570 
AI890696 AA308884 AA507078 R41274 AI365507 T16348 AI560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

AW403342 AW248986 BE561709 AA357312 BE311834 BE389496 8E294887 AW732696 BE047868 AI702383 BE019155 
AI702367 BE408966 BE280458 BE313759 BE513492 BE535404 BE280258 AC005263 NM.007165 L21990 AW732711 
AI564920 AW249094 BE265365 AW607186 AW607346 BE005217 H27211 U46230 BE260068 BE207043 BE546782 
AW248659 
AA085228 AA085161

L40904 NM_005037 X90563 AB005626 H21596 AA088517 
X06Q96 X05826 

BE157260 BE157265 R481 18 H43827 Z17877 AW379070 AW291778 M20605 J03253 M14206 V0O568 AI860465 AW296022 
M13930 AL047400 J00120 BE01B476 AW675223 T26980 F06694 R22709 R24720 H22753 AI903100 AI903094 AW937823 
X00364 D10493 K01904 K01906 K00535 L00058 AA410662 AW384760 AA304930 AI680985 XO0198 H58025 AW998901 
AV653447 N31654 AW610357 AW610369 AW862480 BE223010 AW384172 AW384219 AW384171 AW38421 8 AA298522 
BE140421 AW945162 AW75171 1 AA514409 AW747912 A1214214 W87741 AA972406 AA554513 BE302087 AB49030 
AA477850 AV653129 AI281360 AI2741 10 W87861 AA841366 X66258 AI051600 AA877139 AA527483 AA857219 AI250782 
AA625531 AA807892 AI27B811 AI224033 H24033 AA593398 AW129709 R45453 N22772 AA235530 729737 A1018409 
AI688907 AA568370 AA722760 AI539329 AA550843 AW674698 AI538452 AI538453 AB37957 AA477744 AA464600 
AI140319 AW949294 AI339781 AI828736 AA923634 AA344094 AI278350 AA975567 AA908416 AA857170 AW023520 
R43413 R48004 F02958 AI989439 R1 1207 AA737307 D10493 AW950652 A1093842 AI474Q24 AA703369 R11264 M 13930 
M13930 M13930 M13930 M13930 J00120 M13930 M 13930 X00364 J00120 R19507 AA639812 
N32759 N29730 N30831 N32604 N31955 A1206390 H87574 R23494 AI186215 N30036 AI741512 J00117 NM.000737 
AI453626 AA330974 AI188729 AJ188604 AI188964 N30276 AI188947 AI188830 A11B8303 AI200457 AI219166 AI192459 
AI183260 AI189275 AJ 183639 AI186353 AI189616 AJ184224 AI130720 Al 188454 AJ 188391 AI148857 AI192447 A1209155 
AI190013 AI206355 AI188721 AI189429 AI189364A1 186330 AW31 595 AI1 89595 AI188781 AI148647 AI200022 AI221 552 
AI220923 AM 88728 AA233034 A1189807 AI189641 AI218044 AI148774 A120O658 W71989 AI207360 AI188824 AI20G559 
AI200270 AA644163 AI199943 AI151301 AI189555 AI262724 AI148590 AI148695 AM26906 AI149163 K03183 K03189 
A1189842 AI221014 N30608 AI1 86465 A1220865 AI18849B Al 138226 AJ 189968 AI221019 AI138197 AI149426 AI148904 
AI186218 AI183348 A1160579 AI198460 AI149039 AI160936 AI219055 AI184784 AI221580 AI161082 A1160814 AI123896 
A1417614 AI126101 AI1 88872 AI149571 AI1 68533 AI149072 AI149467 AI1 31286 N30684 AM 60705 AI1 60692 AI149559 
AI273580 AI189442 AI138448 AI149591 N27302 AA400910 AI138431 AI138435 AI128407 N30216 AI128296 AI219589 
AI188492 AI149447 AI168482 H95374 AI219009 N31616 AI276216 N32233 AI291937 N30741 AI1B8689 N27111 R23214 
AI221 605 AM 84348 AI200375 H94451 N26397 AI871881 AA232905 N30833 A1220780 H94448 N30822 H87464 R68815 
N30290 AI128424 H12587 T47334 H87631 H87156 AI219133 AI868741 AA330859 H86993 AA330413 H93656 N30817 
T90191 H93668 AI200054 H95207 T47316 H95381 T49170 R00880 T49171 N27381 H94107 R63352 T85053 AW451899 
H95142 N30313 H9401 5 H86987 T28278 N29701 C18834 AA331267 AA330939 A1654493 N27073 N29831 R681 13 N30758 
R26088 N32108 H95135 AA330414 AA330978 AI219422 Al 189453 Al 199951 X00264 NM.000894 AA371909 AA063498 
T29543 AA371971 AA372026 AA371978 AA371346 AI051683 AI186418 AI220659 AI189068 AI219268 AI186552 AI188715 
AI149156 

AW794626 M27126 M27014 

J05581 M61 170 T27692 M34088 M34089 AW860335 AW579047 AW610437 AW610386 AW61 0422 AW610473 AW579078 
AW604897 AW860163 AW579067 AW862410 AI816584 AW177757 AW602769 AI909790 AW860331 A1909787 AI90981 1 

100800 24735J 




100818 19604.3 

100881 458J27 

100885 12707.3
100898 8542.1 






102469 3556.1 
126126 1630017.1 
102620 16821.37 
102673 24986.6 
102675 6145.4 
102753 2226.1 
102799 34624.4 
127034 51148.2 

103522 21640J 



127071 188097.1 

126456 291965J

119388 1762256.1 

126856 20669.1 



103996 224545.1



113213 23798.1 



134947 844579.1 
129311 16078.1 



AI909813 AW845083 AI905920AW387919BE140766AI909279AW369405 AA429321 AA429320 AA367451 AA847972 
AW001 137 AI567905 T84561 AI631295 AA151351 H02932 AI884519 AA367457 AW369421 AI676848 AW391803 AI610869 
AW192838 A1922289 AI952140 AI910233 AI479474 AW001 395 AA488073 AI985760 AW130017 AI858369 AA627845 
AW081805 AA158865 AI624443 AA344985 AA569793 R72486 AI589329 AI903204 AI269893 AA641284 A1279932 AA149270 
AI697120 AA7291 46 A1589353 AA480067 AI92331 0 AA5309Q8 AI275395 AA425062 AA580280 AA889527 AA158866 
AW131341 AA573028AA877326 729335 AW951288 H04235 AA099243 AA994659 AI659618 AA8879faAl299297 
AW001 116 AW263844 AI270578 AA970828 AW572126 AA775299 AW369449 AW369398 AW369452 AI933677 AI870710 
AI092911 AI582464 AJ497674 AA937026 AA885865 L38597 AA908325 AW369432 AWQ26623 AA627778 AK64942 
AA932409 A! 187328 AI672970 A1886098 AW440471 AW138860 AI866858 AI802528 AI 925 172 AW243914 AI933690 
AA996114AA536189AW009937Ai918060Al270379AI973169AW175638AW369413 

NM.006227 L26232 R50649 AU077024 AL008726 AM1 1079 R35151 BE2781 53 BE278139 AM59777 R88036 Z43210 
F07326 AF052157 R17844 BE615476 T82160 R71985 H21963 AA299158 AW368248 R48123 R50628 R70441 H27245 
H72015 R72345 R39392 A1909738 BE612778 BE613234 D521 16 D52136 D52132 D52067 D51922 D51995 051905 N34249 
N25459 AA464436 AA297350 AA297466 R81736 H02737 AW582505 R27523 AI834241 AW130867 W72668 W76426 
AA358363 R50262 AW473860 H52335 H43953 H21964 T39505 AI8S7517 AW1 56925 AW839B50 H02628 AW007705 
AI561008 F22392 R71279 AA995433 R50725 W24462 R71931 AA464437 AW591731 R25667 R52695 R50810 AI560805 
AI089266 H68386 H41353 H28590 AW001860 AJ141623 AA250773 A128477B AW51 1412 AW083975 AA130377 AWQ26047 
R50551 R81494 AI357668 AI078272 F32668 F36981 AW304865 H43906 AA931068 R48010 AI540217 AI017339 A1291812 
AI741954 AA458490 AI088378 AA298764 H61 168 AA358362 AA298725 AA298515 AA464148 AA443538 R43046 AA084314 
T40641 T47608 T48940 A1082477 AW470145 N92284 AI758958 AA298512 AA284586 A1597777 AA480277 AI932559 
A1869081 AA476615 AA503651 At656024 AW168522 AI682051 AI689106 AI274592 AI520917 BE258916 BE615861 
BE280282 R53386 BE278255 BE278398 T47607 AA477662 H68385 

100817 19648 1 L34355L46810NM.000023 U08895AA424260AI097272 AA424162 N79764F19290 F25278A1479385 
AA460662 AA432059 AW016935 F25770 F32549 F36677 F33016 F35992 F36010 AW1 72497 AA335076 F28727 AA211643 
AA453282 

U79251 AA843851 R38201 R66461 R44908 AA683289 H17477 R37364 R52832 AW298336 AA351391 NMJJ02545 L34774 
AA296886 AW967001 T28889 R13451 T77331 AL119186 AL1 18830 H08459 AW892812 AW905838 H17585 R52878 
BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BE514250 BE397832 
BE269598 BE559865 BE396881 BE560031 BE514199 BE560037 BE560454 
X07881 NM.006249 X07637 AA376715 AA376677 X07715 X07704 S80916 

BE387614 R51501 AA199714 AW674779 F08178 BE269071 AA376313 H08264 AA380420 H18785 AL042151 BE277758 
BE267438 NM.005850 L35013 BE540833 BE390902 BE391494 BE277459 BE385592 BE390612 BE384263 8E387779 
BE388647 BE537373 BE547158 AW409585 AW374033 AW602185 AA355725 AW577548 AW935015 AW935160 W40232 
AW938647 AW374332 AA434040 BE293488 AL138361 BE560260 AI745075 AA317980 AW949382 AI83431 1 AI653582 
A1831042 AI361878 AA618606 AA729052 A1424969 AA199715 AW769374 A1828422 AW044307 AI862816 A1203583 
AW084461 AW514655 AA831883 AA290672 AA831286 AA578510 AW089965 AW150746 AA292743 H22232 A1469275 
AW439312 AA292744 AW471443 AI473989 AA593336 AA464070 AI678937 AW069451 AA970763 AA610480 AA593328 
AM64009 AA768985 AI298928 AA436600 AA464718 AA699361 D61482 055935 A1369591 AA470695 AI809135 AA640627 
AI568446 R51502 W45467A1655316AA463934 AW168609 AW518663BE045525 Z41251 A1868091 AA908160 AI026697 
AI886259 AI612932 AA215437 AI956014 BE541087 BE255652 BE265878 BE3941Q2 W27502 
U48936 L36592 X87160 NM.001039 AL036606 AL036420 U35630 AW298574 
W80551 M85370 
AA976427 US6052 
AI457548 U72509 
U72512 T98357 R31335 F18090

L32961 NM_000663 U80228 S75578 AA425061 AA429317 AI815143 AA910669 AI286022 AI286019 
U88896 U88898 AA916056 T03285 AI341594 A1359534 AI634031 U88897 

BE397750 AA232171 BE562900 BE384894 BE242228 BE206819 BE261742 AA296468 AW959763 BE276164 BE2641 09 
BE392628 BE256735 AA301453 N55872 H0 1676 AA292746 AA427485 AA496400 AA352389 

Y10518 Y10514 Z83935 Y10508 AK000055 Y10519 AI142012 AI681 175 BE222219 AA890586 BE504347 BE328064 N63044 

N51226 A1151248 AI521996 AI924777 AW375954 A1860275 W00549 AI742673 AW612288 AI763062 AA632S10 AI087347 

AI088070 AI214349 AA890297 AI494156 AI698598 AA631658 AA504593 AA860733 AI266761 AW663214 AW771231 

AA639610 AI769806 A1769746 AW014326 AI28B61 1 

AA250806 AA459220

AA429212 W00881

T88798 R92430 

AI084125 AKJ83773 A14796B7 AI939609 AI968662 AF129507 NM.013282 AW971840 AW298508 AA744240 AA81 1217 
AA827671 AA811055 AA806567 AA488977 AA908902 AI637637 AA927056 AI870139 AW340492 AA488755 AA129794 
AA306523 AA354253 BE256277 AC053467 AW962084 

AA321355 AW964592 R23284 H73883 R23382 N47914 C01377 H04668 AW606248 R34447 AA847136 AI684489 AI5231 12 
AW044269 AI379138 N29366 AA761543 N79248 AA960845 AA768316 AI147926 A171B599 AI880620 R67467 AI216016 
AI738663 H04648 

NM.001395 Y08302 AI434619 AI470328 AI261807 AW024965 AI806537 AI830549 A1640337 AI219065 AW271700 
AW028488 AI133339 AI859205 R51 1 75 U87167 BE379324 BE392008 AA340819 AA3431 10 T57275 059164 AW29931 2 
AI434422 AJ936390 AW024975 R40262 

AW269126 R09430 T56590 A1367247 AI253132 BE464248 T58658 AW207785 T58607 
R51 194 A1732276 R53587 AI820697 

AK000526 BE550084 W30689 AW271859 AA41 1456 AI341551 AA242990 AA243027 H87046 D20360 AI184053 AA146956 
AI721023 AI718944 AA146955 F18215 AA903890 AI700355 AI075430 AA41 1584 AA87821 0 AI476760 AW945637 AA630596 
AA431522 AA301989 AI909058 D12149 N41960 BE222214 AA609922 AA828176 AA393359 AA398693 AW024956
BE467805 AW298623 AW264085 AI024454 AI024719 AI431 927 T55087 AI61 1014 T54920 AA131253 A1436344 

114427 9724.1 AA017176 AI359979 AA047836 AA017063 AA016303 AA001545

114569 110077.1 AA063315 AA063316

100106 1562L-5 AF015910

100515 342.1 AA305746 D90187 T63943 AW951154 T29182 AI734941 D13264 AI299239 Z18812 AW299859 W24476 AA933064

AA489759

100531 46038.1 AW888554 AW607282 AA319986 M28590

100545 22955.11 M55405 AW752552

100574 17320J AA326695 M1 0038 NM_000365 N84665 H694U N84657 AA380453 AA329743 AA357367 AA188770AA376532 AA353653 

AA158953 AA083176 BE537313 AA181433 D53373 R57376 AA206698 R14807 H18899 H1 1 191 H93892 R25593 761 134 
N93285AA083081 AA831789H13137AA497014 AA079330 AA182861 H13138W47161 R62913/86B7089 AA211112 
AA429237 AL035923 AA 100070 AW392393 AI566433 AAB66006 AA214002 AW392865 N79454 M'197181 AI660371 
AA176501 AA737967 AI089225 F34874 AW571437 AI620620 AA573489 AM23816 AA164917 AA4®455 T47072 A1569087 
AI261656 AA730919 AI633441 AW195182 AI351622 AW243465 A1872649 AI359227 AA987941 AI693770 T47073 AW779948 
AW510580 AI635626 AW627601 AA864326 AA953578 AI341418 BE222853 AI241963 AI094663 AA528330 AA493373 
AW043762 AI377783 AW958987 BE619760 AA385240 BE277975 BE280095 AW631443 AA581048 BE616715 BE299610 
C14874 BE559858 BE378455 BE618290 BE544585 AI525575 BE548897 BE2671 10 AA804738 BE269821 AA9181S3 
BE277647 AA599947 BE280735 BE39G239 N74150 T12S04 AI208197 AW955527 AA1 13897 N40031 H73835 H70393 
A1434041 W22950 A) 192661 BE264461 W26486 AA626424 AA196694 T69209 AA857976 AI540287 AA41Q599 AA864287 
AW950564 AA013320 T49283 AI541438 AW804703 AA335534 AA335659 BE562269 BE61 8802 BE277B50 BE546413 
BE280994 AA204813 BE561694 BE543524 BE253647 AW001452 W191 16 BE542508 AA205894 BE254375 BE270033 
AI525906 BE251792 AA975700 BE272138 AW607671 N87686 M 10036 BE515060 BE288607 AI745178 U47924 H03193 

100527 tigr_HT2798 Z25424 

100756 tigr_HT3768 M88357

100768 tigr_HT3846 L29141 M69180 M81105

100813 tigr_HT4265 L33999

100836 tigr_HT4383 U04688

100855 tigr_HT4504 U09806

102104 entrez_U12139 U12139

125091 genbank_T91518 T91518

100929 tigr_HT688 X65561

125147 entrez_W38150 W38150

102354 entrez_U38268 U38268

102491 entrez_U51010 U51010

102636 entrez_U67092 U67092

118769 genbank_N74496 N74496

101046 entrez_K01160 K01160

101057 entrez_K03430 K03430

108334 genbank_AA070473 AA070473

108417 483241.1 AA070853 AA075749 AA075716

108441 genbank_AA079079 AA079079

108786 genbank_AA128999 AA128999

101655 entrez_M60299 M60299

101697 entrez_M64358 M64358

117437 genbank_N27645 N27645

101798 entrez_M85220 M85220

101909 entrez_S69265 S69265

103508 entrez_Y10141 Y10141

103575 entrez_Z26256 Z26256

119332 genbank_T54095 T54095

112161 genbank_R48295 R48295

119564 NOT_FOUND_entrez_W38206 W38206

114376 NOT_FOUND_entrez_GMCSF GMCSF

100478 tigr_HT1067 M22406

100547 tigr_HT2219 M57417

100564 tigr_HT2324 Z11585
TABLE 12: shows genes, including expression sequence tags, that are down-regulated in
prostate tumor tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average"
prostate cancer tissues.



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title 

R1: Ratio of background-subtracted normal prostate to prostate tumor tissue
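
As a minimal sketch of how a ratio of the R1 kind could be computed, the following Python fragment assumes that each tissue class is represented by a list of raw probe-set intensities and that a single background value is subtracted from every reading. The function name `r1_ratio`, the use of the arithmetic mean for the "average", and the positive `floor` applied after subtraction are illustrative assumptions; the table legend does not state how the averages were formed.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Return average normal signal divided by average tumor signal.

    normal, tumor: raw intensities for one probe set (hypothetical inputs).
    background:    value subtracted from every reading (assumed to be shared).
    floor:         smallest signal kept after subtraction, so the tumor
                   average cannot reach zero (an assumption of this sketch).
    """
    normal_signal = [max(x - background, floor) for x in normal]
    tumor_signal = [max(x - background, floor) for x in tumor]
    return mean(normal_signal) / mean(tumor_signal)
```

With normal readings of 110 and 130, tumor readings of 40 and 60, and a background of 10, the averages are 110 and 40, so the sketch returns 2.75. A value above 1 corresponds to a probe set that is down-regulated in tumor tissue, which is the direction tabulated below.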



Pkey ExAccn UnigeneID Unigene Title R1

100522 HG1763-HT1780 Prolactin-Induced Protein 17.4

130803 M81650 Hs.1968 semenogelin I 16.765

118068 N53943 Hs.13743 ESTs 13.225

114251 Z39898 Hs.21948 ESTs ~* 127 

112134 R46025 Hs.7413 ESTs 8.735 

101436 M20642 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 8.175

104028 AA361094 Hs.221128 ESTs 8.15

108344 AA149204 Hs.175783 ESTs; Highly similar to growth arrest inducible gene product [H.sapiens] 7.535

103838 AA174173 Hs.12622 ESTs 7.212

120469 AA251741 Hs.25882 DKFZP586M1824 protein 7.175

110279 H29231 Hs.27384 ESTs 6.701

127472 AA761378 Hs.192013 ESTs 6.642

133301 N35229 Hs.7037 pallid (mouse) homolog; pallidin 6.411

102457 U48807 Hs.2359 dual specificity phosphatase 4 6.395

114011 W90385 Hs.15082 ESTs 6.15

101249 L33881 Hs.1904 protein kinase C; iota 6

123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6

119322 T49655 Hs.241569 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 5.95

101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 5.925

115586 AA399218 Hs.92423 ESTs 5.7

120590 AA281780 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7

109748 F10192 Hs.248323 Tubulin; alpha; brain-specific 5.625

134727 X80507 Hs.8939 yes-associated protein 65 kDa 5.5

129171 AA234048 Hs.7753 calumenin 5.486

120390 AA233122 Hs.1 11460 ESTs; Highly similar to multtfun to 

kinase ll delta2 isoform [H.saplens] 5.4 

131699 R68657 Hs.90421 ESTs; Moderately similar to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 5.279

104490 N71503 Hs.43087 ESTs; Weakly similar to dysferlin [H.sapiens] 5.266

102124 U14528 Hs.29981 solute carrier family 26 (sulfate transporter); member 2 5.151

109280 AA196635 Hs.86081 ESTs 5.134

109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075

108087 AA045709 Hs.40545 ESTs 5.075

135006 M21665 Hs.929 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055

119182 R80664 Hs.77067 ESTs 5.033

129806 R62444 Hs.173373 KIAA0931 protein 4.675

101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626

125954 R93943 yt72c12.r1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:275735 5' 4.6

113989 W87544 Hs.221184 ESTs 4.559

104432 J03460 Hs.99949 prolactin-induced protein 4.451

112326 R56068 Hs.4263 ESTs 4.45

119063 R16833 Hs.53106 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 4.45

130376 R40873 Hs.155174 KIAA0432 gene product 4.301

122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 4.2

104142 AA447006 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING 4.175

129413 N32787 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1

103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05

114266 Z40186 Hs.26409 ESTs 4.05

115206 AA262491 Hs.186572 ESTs 4.048

123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041

129130 H97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein [H.sapiens] 4.026
120217 Z41078 Hs.66035 ESTs 4.028
108536 AA084524 zn19d8.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA 4.023

134460 AA400030 Hs.8380 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens] 3.925

120418 AA236010 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 3.91

132783 N74897 Hs.5683 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 3.889

125052 T80174 Hs.222779 ESTs; Moderately similar to similar to NEDD-4 [H.sapiens] 3.85
108600 AA099585 Hs.41175 ESTs 3.833
103099 X61100 Hs.8248 NADH dehydrogenase (ubiquinone) Fe-S protein 1 (75kD) (NADH-coenzyme 3.818
134948 H06773 Hs.93850 protein kinase; AMP-activated; gamma 2 non-catalytic subunit 3.792
120511 AA258144 Hs.221576 ESTs 3.779
111861 R37460 Hs.25231 ESTs 3.768
113966 W86600 Hs.9842 ESTs 3.75
131649 AA481254 Hs.30120 ESTs 3.708
129775 R94659 Hs.12420 ESTs 3.707
110191 H20568 Hs.27182 phospholipase A2-activating protein 3.7
112678 R87160 Hs.33665 ESTs 3.7
127115 AA375791 Hs.131894 ESTs 3.674
132892 W92797 Hs.59378 DKFZP434G162 protein 3.653
115023 AA252079 Hs.63931 dachshund (Drosophila) homolog 3.625
114932 AA242751 Hs.16218 KIAA0903 protein 3.62
106885 AA487228 Hs.19479 ESTs 3.614
134480 AA024664 Hs.83916 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex; 5 (13kD; B13) 3.613
124760 R42493 Hs.220839 ESTs 3.6
130831 AA025399 Hs.169737 ESTs 3.592
134154 AA211320 Hs.79404 neuron-specific protein 3.568
104160 AA455706 Hs.99722 ESTs; Weakly similar to 78 KD GLUCOSE REGULATED PROTEIN PRECURSOR 3.559

105524 AA258158 Hs.22153 ESTs; Weakly similar to KIAA0352 [H.sapiens] 3.542

110168 H19673 Hs.176586 ESTs 3.525

109480 AA233299 Hs.72158 ESTs 3.522

109685 F02367 Hs.27252 ESTs 3.5

115134 AA257107 Hs.194331 ESTs 3.5

116083 AA455653 Hs.44581 ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens] 3.459

120524 AA261852 Hs.192905 ESTs 3.45 

116932 H74330 Hs.150000 ESTs 3.425

130746 AA256976 Hs.18800 ESTs; Weakly similar to KIAA0579 protein [H.sapiens] 3.42

107513 X05451 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 3.417 

118641 N70298 Hs.49829 ESTs 3.407 

126584 AI028384 Hs.127331 ESTs 3.399 

105134 AA159953 Hs.22895 ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens] 3.325

123502 AA600116 Hs.112528 ESTs 3.318

132389 N50866 Hs.47135 ESTs 3.317 

105691 AA287097 Hs.75356 transcription factor 4 3.315 

131505 H85897 Hs.27755 ESTs 3.309

120775 AA342104 Hs.96777 EST 3.3 

1(^579 AA278824 Hs.19218 ESTs 3.295

128190 AA946876 Hs.148376 ESTs 3.292

100819 HG4020-HT4290 Transglutaminase 3.288

130217 D29956 Hs.152818 ubiquitin specific protease 8 3.273

130068 AA608903 Hs.106220 KIAA0336 gene product 3.269

134719 L07515 Hs.89232 chromobox homolog 5 (Drosophila HP1 alpha) 3.266

110277 H29209 Hs.151231 ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 3.26

127354 AA418880 Hs.185797 ESTs 3.212

129173 R60523 Hs.109087 ESTs 3.197 

127464 AA970504 Hs.146103 ESTs 3.179 

124923 R94500 Hs.108046 ESTs 3.175 

122465 AA448164 Hs.99153 ESTs; Highly similar to CGI-73 protein [H.sapiens] 3.151

122027 AA431302 Hs.98721 EST; Weakly similar to N-copine [H.sapiens] 3.151 

103329 X85134 Hs.72984 retinoblastoma-binding protein 5 3.15 

129937 M95767 Hs.135578 chitobiase; di-N-acetyl- 3.15

134197 AA057341 Hs.67689 helicase-moi 3.15

107764 AA018219 Hs.226923 ESTs 3.125 

121775 AA421773 Hs.161008 ESTs 3.125 

114768 AA149007 Hs.182339 Ets homologous factor 3.12 

132381 N48818 Hs.46884 ESTs 3.11 

123105 AA485973 Hs.143947 ESTs 3.104 

121176 AA4OOC80 Hs.97774 ESTs 3.1

125053 T80620 Hs.186473 ESTs 3.075
105909 AA401739 Hs.5111 ESTs
119797 W72562 Hs.58119 ESTs 3.057

115776 AA424038 Hs.38197 ESTs 3.056

111713 R22988 Hs.220950 ESTs 3.05

115301 AA280047 Hs.43948 ESTs 3.05

118448 N66412 Hs.49189 ESTs 3

105586 AA456598 Hs.256269 ESTs 2.995

110415 H48239 Hs.29739 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-3A [H.sapiens] 2.979

105173 AA182030 Hs.8364 ESTs 2.978

101102 L07594 Hs.79059 transforming growth factor; beta receptor III (betaglycan; 300kD) 2.978

110543 H58383 Hs.258544 ESTs 2.976

125593 R24464 Hs.202949 KIAA1102 protein 2.964

100824 HG4058-HT4328 Oncogene Aml1-Evi-1, Fusion Activated 2.957

106822 AA481068 Hs.31835 ESTs 2.95

131983 D11930 Hs.3592 ESTs 2.95

111221 N68869 Hs.15119 ESTs 2.936

113820 T93795 Hs.17252 EST 2.917

105220 AA210695 Hs.17212 ESTs 2.917

123234 AA490227 Hs.105252 ESTs 2.904

125250 W87465 Hs.222926 ESTs; Weakly similar to D2092.2 [C.elegans] 2.9

116196 AA465160 Hs.63386 ESTs 2.9

122100 AA432243 Hs.41086 ESTs; Weakly similar to OXYSTEROL-BINDING PROTEIN [H.sapiens] 2.896

111712 R22905 Hs.113716 ESTs 2.895

126589 W78107 Hs.187698 ESTs; Weakly similar to Yer140wp [S.cerevisiae] 2.895

111132 N64378 Hs.13149 ESTs; Highly similar to unknown function [H.sapiens] 2.894

115307 AA280300 Hs.191346 ESTs 2.886

108989 AA152263 Hs.18827 KIAA0849 protein 2.883

129486 H03686 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.879

119805 W73788 Hs.43213 ESTs 2.875

125721 R59881 Hs.7503 ESTs 2.871

103704 AA028171 Hs.153688 ESTs 2.868

128420 AI088155 Hs.14146 ESTs; Weakly similar to unknown [H.sapiens] 2.866

120571 AA280738 Hs.128679 ESTs ZM 

123059 AA482019 Hs.238202 EST 2.86

129462 D84239 Hs.111732 IgG Fc binding protein 2.856

125166 W45491 Hs.172609 nucleobindin 1 2.854

125992 W01626 za36e07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 2.852

109431 AA227972 Hs.43635 ESTs 2.85

105077 AA142919 Hs.5558 ESTs 2.847

131388 R34531 Hs.92200 KIAA0480 gene product 2.846

121080 AA398720 Hs.177953 ESTs 2.838

112575 R73816 Hs.17385 ESTs 2.836

130244 R26206 Hs.153293 KIAA0701 protein 2.825

134698 AA427783 Hs.77910 3-hydroxy-3-methylglutaryl- 2.816

116355 AA504356 Hs.38650 ESTs 2.813

115316 AA280627 Hs.37846 ESTs 2.806

129677 U48736 Hs.198891 serine/threonine-protein kinase PRP4 homolog 2.8

130971 R20332 Hs.28707 signal sequence receptor; gamma (translocon-associated protein gamma) 2.799

115054 AA252863 Hs.87729 ESTs 2.795

130285 AA063546 Hs.202968 ESTs 2.792

124308 H93575 Hs.227146 Homo sapiens mRNA; cDNA DKFZp564J142 (from clone DKFZp564J142) 2.783

125502 AA732329 Hs.191959 ESTs 2.778

114800 AA159825 Hs.131887 ESTs; Weakly similar to ORF YNL227c [S.cerevisiae] 2.768

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapiens] 2.766

130159 H51098 Hs.151310 PDZ domain protein (Drosophila inaD-like) 2.75

107127 AA620504 Hs.22119 ESTs 2.742

113547 T90746 Hs.15233 ESTs 2.734 

104639 AA004622 Hs.18214 ESTs 2.727 

127609 AA622559 Hs.150318 ESTs 2.726 

106922 AA490984 Hs.10056 ESTs 2.725 

124825 R52088 yg85c3.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 2.725

124333 H98683 Hs.154054 ESTs 2.708 

117634 N36421 Hs.107854 ESTs; Weakly similar to SODIUM- AND CHLORIDE-DEPENDENT GLYCINE TRANSP 2.706

101609 M54927 Hs.1787 proteolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2; uncomplicated) 2.704

117142 H96908 Hs.42251 ESTs 2.7

112602 R79147 Hs.203365 ESTs 2.695 

106828 AA481505 Hs.13797 ESTs 2.68

124377 N25996 Hs.179833 ESTs ^ m 






101026 J04970 carboxypeptidase M 2.675 

124560 N66393 Hs.102754 ESTs 2.675

124066 H02494 Hs.101615 ESTs 2.671

130281 R12777 Hs.15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE [H.sapiens] 2.66

110949 N49602 Hs.13303 ESTs 2.65

111031 N54839 Hs.221085 ESTs; Highly similar to mediator [H.sapiens] 2.633

121770 AA421714 Hs.11469 KIAA0896 protein 2.63

134132 U32519 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2.626

112424 R62452 Hs.191265 ESTs 2.625

122544 AA451679 Hs.194410 ESTs 2.625

134425 X90568 Hs.172004 titin 2.624

111114 N63391 Hs.9238 ESTs 2.619

116119 AA459242 Hs.44445 ESTs; Weakly similar to Kelch motif containing protein [H.sapiens] 2.615

112079 R44164 Hs.23014 ESTs 2.6

123033 AA481271 Hs.193945 ESTs 2.591

124196 H52617 Hs.144167 ESTs 2.586

125873 H14437 yf25a04.r1 Soares breast 3NbHBst Homo sapiens cDNA clone 2.58

117684 N40184 Hs.45050 ESTs 2.575

134938 D30037 Hs.168326 phosphotidylinositol transfer protein; beta 2.575

131822 AA215647 Hs.200332 ESTs 2.568

135185 U71203 Hs.26038 Ric (Drosophila)-like; expressed in many tissues 2.564

117690 N40467 Hs.93834 ESTs 2.557

118807 N78582 Hs.50732 protein kinase; AMP-activated; beta 2 non-catalytic subunit 2.552

121369 AA405657 Hs.128791 Human DNA sequence from clone 967N21 on chromosome 20p12.3-13. Contains 2.55

114860 AA235112 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H.sapiens] 2.549

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAIR PROTEIN COMPLEMENTING 2.548

110190 H20560 Hs.244624 ESTs 2.548

132573 AA045333 Hs.51743 ESTs; Weakly similar to !! ALU SUBFAMILY SB2 WARNING ENTRY !! [H.sapiens] 2.542

109706 F09729 Hs.12780 ESTs 2.537

135109 AA410391 Hs.94592 klotho 2.525

132810 R37027 Hs.5737 KIAA0475 gene product 2.525

124879 R73588 Hs.101533 ESTs 2.525

103840 AA174190 Hs.50932 ESTs 2.525

119068 R22196 Hs.34492 ESTs 2.519

114833 AA234362 Hs.87310 ESTs; Moderately similar to CGI-66 protein [H.sapiens] 2.507

112998 T23555 Hs.103283 ESTs 2.5

123312 AA4S6258 Hs.99601 ESTs 2.499 

121873 AA426270 Hs.145696 splicing factor (CC1.3) 2.491

123321 AA496884 Hs.23972 ESTs 2.491

107760 AA018042 Hs.35078 EST 2.483

102580 U60808 Hs.152981 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 2.481

103053 X56741 Hs.5947 mel transforming oncogene (derived from cell line NK14)- RAB8 homolog 2.475

124756 R38100 Hs.106294 ESTs 2.475

112936 T15665 Hs.5185 ESTs; Weakly similar to BcDNA.GH12174 [D.melanogaster] 2.475

45 125178 W^02 Hs.125731 ESTs 2.475 

112423 R62447 Hs£2123 ESTs 2.471 

123515 AA600323 Hs.112535 EST 2.462 

102842 U95020 Hs.21903 calcium channel voltage-dependent; beta 4 subunit 2.457

102400 U42390 Hs.171957 triple functional domain (PTPRF interacting) 2.455

113187 T55056 Hs.9992 ESTs 2.452

131687 L11066 Hs.3069 heat shock 70kD protein 9B (mortalin-2) 2.448

115314 AA280583 Hs556501 ESTs 2.437 

128211 AI206427 Hs.168707 ESTs; Highly similar to Ran-binding protein 2 [H.sapiens] 2.43

134281 L11005 Hs.51047 aldehyde oxidase 1 2.425

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H.sapiens] 2.425

111348 N90041 Hs.9585 ESTs 2.418

129430 AA258842 Hs.197877 Homo sapiens clone 23777 putative transmembrane GTPase mRNA; partial cds 2.418

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2.417

111164 N66857 Hs.14808 ESTs; Weakly similar to !! ALU CLASS C WARNING ENTRY !! [H.sapiens] 2.416

132143 AA257058 Hs.7972 KIAA0871 protein 2.412

130330 M55047 Hs.154679 synaptotagmin 1 2.408

114219 Z39451 Hs.27389 ESTs 2.406

117101 H94043 Hs.24341 DKFZP586I1419 protein 2.403

125433 AA034325 Hs.54320 ESTs 2.4

111099 N62506 Hs.21958 ESTs 2.4

120323 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2.397

118624 N69998 Hs.21601 ESTs 2.394

123570 AA608955 Hs.109653 ESTs 2.389

123562 AA608893 Hs.190065 ESTs 2.388
131546 AA262821 Hs.28578 muscleblind (Drosophila)-like 2.385

103143 X66141 Hs.75535 myosin; light polypeptide 2; regulatory; cardiac; slow 2.384

123645 AA609310 Hs.188691 ESTs 2.383

130123 AA001835 Hs.150390 zinc finger protein 262 2.379

131682 AA428368 Hs.30654 ESTs 2.378

115909 AA436666 Hs.59761 ESTs 2.375

125168 W45574 Hs.252497 ESTs 2.372

123973 C14605 Hs.182151 ESTs 2.381

135197 U76456 Homo sapiens tissue inhibitor of metalloproteinase 4 mRNA, complete cds 2.357

118689 N71545 Hs.184544 ESTs 2.357

107734 AA016225 Hs.93386 ESTs 2.354

124590 N69220 Hs.41381 ESTs; Weakly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 2.35

111163 N66850 Hs.17606 ESTs 2.348

112349 R58877 Hs.22665 ESTs; Moderately similar to dJ83L6.1 [H.sapiens] 2.345

128076 AA262179 Hs.169343 ESTs 2.345

134238 R81509 Hs.184571 splicing factor; arginine/serine-rich 11 2.341

116768 H13260 Hs.95097 ESTs 2.338

106331 AA436853 Hs.34795 ESTs 2.333

129003 AA443752 Hs.10784 ESTs 2.332

132368 AA599814 Hs.46837 ESTs; Weakly similar to cDNA EST yk289g5.5 comes from this gene [C.elegans] 2.332

124697 R06273 Hs.186467 ESTs; Moderately similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.322

120273 AA176688 Hs.221139 ESTs 2.313

127110 AA304993 Hs.100861 ESTs; Weakly similar to p60 katanin [H.sapiens] 2.307

105450 AA252621 Hs.93842 ESTs 2.301

119819 W74371 Hs.58383 ESTs 2.297

102302 U33052 Hs.69171 protein kinase C-like 2 2.288

130598 N74353 Hs.16475 ESTs 2.282

114161 Z38904 Hs.22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 2.278

130542 U64675 Human sperm membrane protein BS-63 mRNA, complete cds 2.277

104491 N71513 Hs.39328 ESTs 2.275

116988 H82527 ys69e12.s1 Soares retina N2b4HR Homo sapiens cDNA clone 2.275

126823 AA370120 Hs.7870 ESTs; Weakly similar to Ylr350wp [S.cerevisiae] 2.273

108800 AA129731 Hs.90424 ESTs 2.273

101310 L41607 Hs.934 glucosaminyl (N-acetyl) transferase 2; I-branching enzyme 2.269

126842 W19498 Hs.21085 ESTs 2.255

127251 AA936428 Hs.128638 ESTs 2.251

124647 N91947 Hs.125033 ESTs 2.249

127112 AI143906 Hs.125103 ESTs 2.247

101973 S82597 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polypeptide 2.246

120999 AA398302 Hs.127437 ESTs 2.245

130225 AA599583 Hs.15299 HMBA-inducible 2.243

119980 W88678 Hs.249247 heterogeneous nuclear protein similar to rat helix destabilizing protein 2.243

124222 H61053 Hs.222844 ESTs 2.24

129199 H90914 Hs.128629 ESTs 2.236

106802 AA479101 Hs.16570 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 2.231

126160 N90960 Hs.247277 ESTs; Weakly similar to transformation-related protein [H.sapiens] 2.229

104627 AA001976 Hs.18603 ESTs 2.228

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 2.226

113096 T40927 Hs.3345 ESTs 2.225

135336 AA452822 Hs.39027 ESTs 2.225

135344 R62976 Hs.168491 ESTs; Moderately similar to TRF1-interacting ankyrin-related 2.225

126156 AA508354 Hs.118448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2.222

128885 AA397841 Hs.180141 cofilin 2 (muscle) 2.218

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2.217

114481 AA033562 Hs.151572 ESTs 2.212

109292 AA199828 Hs.188662 ESTs 2.212

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2.209

132932 T15482 Hs.6093 ESTs 2.204

127392 AA262728 Hs.14898 Homo sapiens clone 24590 mRNA sequence 2.204

104641 AA004652 Hs.18564 ESTs 2.2

122529 AA449828 Hs.99229 ESTs 2.195

124307 H93562 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193

133601 S95936 Hs.75155 transferrin 2.193

119904 W85709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192

100348 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185

126871 AA351779 Hs.200334 ESTs 2.18

127793 AI298835 Hs.30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178

105149 AA169253 Hs.8958 ESTs 2.177

121367 AA405648 zw39g8.s1 Soares_total_fetus_Nb2HF8_9w H sapiens cDNA clone IMAGE:772478 2.177
111836 R36228 Hs.25119 ESTs 2.175

133394 R16759 Hs.237225 ribosomal protein S5 pseudogene 1 2.175

123207 AA489697 Hs.145053 ESTs 2.175

129801 F11087 Hs.239668 ESTs 2.175

103393 X94612 Hs.41749 protein kinase; cGMP-dependent; type II 2.161

132415 AA043223 Hs.4815 nudix (nucleoside diphosphate linked moiety X)-type motif 3 2.157

106369 AA443828 Hs.25324 ESTs 2.157

122983 AA478446 Hs.69559 KIAA10S6 protein 2.156

133473 M19309 Hs.73960 troponin T1; skeletal; slow 2.155

134257 C06270 Hs.8078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155

135156 AA056012 Hs.9552 binder of Arl Two 2.151

104055 AA393755 Hs.117211 ESTs; Highly similar to CGI-62 protein [H.sapiens] 2.15

102313 U33921 HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15

109788 F10638 Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15

103507 Y10032 Hs.159640 serum/glucocorticoid regulated kinase 2.15

116000 AA448710 Hs.41327 ESTs 2.15

105858 AA399164 Hs.227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137

103153 X66534 Hs.75295 guanylate cyclase 1; soluble; alpha 3 2.137

126202 AA652238 Hs.199726 ESTs 2.135

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q31 2.134

104164 AA458770 Hs.27023 KIAA0917 protein 2.132

108692 AA121270 Hs.82960 ESTs 2.128

122878 AA465341 Hs.99640 ESTs 2.126

134771 L13939 Hs.89576 adaptor-related protein complex 1; beta 1 subunit 2.125

104298 D31120 Hs.40368 adaptor-related protein complex 1; sigma 2 subunit 2.125

104840 AA039595 Hs.42458 Homo sapiens mRNA; cDNA DKFZp586C1817 (from clone DKFZp586C1817) 2.125

122180 AA435798 Hs.98835 ESTs; Moderately similar to putative ring zinc finger protein 2.125

131012 H01992 Hs.202949 KIAA1102 protein 2.125

134092 H17490 Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123

118617 N69666 Hs.183413 ESTs; Moderately similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123

107155 AA621202 Hs.7946 DKFZP586D1519 protein 2.12

130925 N71935 Hs.169378 multiple PDZ domain protein 2.12

135167 U63717 Hs.95821 osteoclast stimulating factor 1 2.118

105952 AA405263 Hs.181400 ESTs 2.109 

110308 H38148 Hs.32775 ESTs 2.108

116368 AA521186 Hs.94217 ESTs 2.107

132939 U76189 Hs.61152 exostoses (multiple)-like 2 2.102

117881 N50073 Hs.84926 ESTs; Highly similar to B-IND1 protein [M.musculus] 2.1

121723 AA419622 Hs.104800 ESTs; Weakly similar to Mouse 19.5 mRNA; complete cds [M.musculus] 2.098

103500 Y09443 Hs.22580 alkylglycerone phosphate synthase 2.094

121429 AA406293 Hs.193498 ESTs 2.093

134632 AA398710 Hs.174139 chloride channel 3 2.091

129785 F10980 Hs.184780 ESTs 2.09

111065 N58193 Hs.18740 ESTs; Weakly similar to 1-evidence 2.089

114710 AA129931 Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083

132711 N73702 Hs.238927 ESTs 2.083

133377 R05490 Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079

124773 R40923 Hs.106604 ESTs 2.078

117759 N47587 Hs.97345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2.076

127386 AI457411 Hs.106728 ESTs 2.076

101167 L15309 Hs.193677 zinc finger protein 141 (clone pHZ-44) 2.075

109597 F02582 Hs.14474 ESTs 2.074

124390 N29325 Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07

116225 AA478609 Hs.47278 Human Chromosome 16 BAC clone CIT987SK-A-735G6 2.07

131243 R16667 Hs.24752 spectrin SH3 domain binding protein 1 2.069

130557 T90830 Hs.15981 ESTs; Weakly similar to line-1 protein ORF2 [H.sapiens] 2.067

134103 D14826 Hs.155924 cAMP responsive element modulator 2.064

108833 AA131886 Hs.61661 ESTs; Weakly similar to DY3.6 [C.elegans] 2.063

112286 R53765 Hs.158135 KIAA0981 protein 2.063

125624 AA165411 zq49a01.M Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061

124612 N72200 Hs.13913 ESTs 2.058

116335 AA495830 Hs.87013 ESTs 2.057

112248 R51361 Hs.23423 ESTs 2.056

115789 AA424754 Hs.43149 ESTs 2.056

107029 AA599219 Hs.187492 ESTs; Weakly similar to ALR [H.sapiens] 2.056

110294 H30270 Hs.165062 ESTs 2.054

120532 AA262354 Hs.186648 ESTs 2.054

118180 N59249 Hs.48349 ESTs 2.052 

132018 AA293194 Hs.3737 ESTs 2.052 

132617 AA171913 Hs.5338 carbonic anhydrase XII 2.05

131528 N36167 Hsi8274 ESTs 2.05

113254 T64438 Hs.11449 DKFZP564O123 protein 2.05

122785 AA459978 Hs.99508 ESTs 2.05

107203 D20426 Hs.5656 EST 2.05

105713 AA291321 Hs.184319 ESTs; Moderately similar to KIAA1008 protein [H.sapiens] 2.046

129385 D82675 Hs.110950 Homo sapiens clone 25007 mRNA sequence 2.042

119116 R43845 Hs.64595 DKFZP556E2346 protein 2.04

116405 AA600253 Hs.55601 ESTs; Highly similar to host cell factor 2 [H.sapiens] 2.04

125924 AA526849 Hs.82109 syndecan 1 2.038

105599 AA279442 Hs.143460 protein kinase C; nu 2.037

119741 W70205 Hs.43670 kinesin family member 3A 2.037

101449 M21494 Hs.118843 creatine kinase; muscle 2.036

107109 AA609943 Hs.32793 ESTs 2.034

117040 H89112 yw25e5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE25328 2.034

132906 AA142857 Hs.234896 ESTs; Highly similar to geminin [H.sapiens] 2.031

105479 AA255546 Hs.23467 ESTs 2.027

102031 U04898 Hs.2156 RAR-related orphan receptor A 2.027

119846 W80363 Hs.58446 ESTs 2.024

124809 R46482 Hs.106875 ESTs 2.024

130286 AA041548 Hs.154023 KIAA0573 protein 2.023

124457 N50114 Hs.128704 ESTs 2.017

125144 W37999 Hs.24336 ESTs 2.017

120581 AA281257 Hs.125868 ESTs 2.014


104931 AA0ft97ai Hs.108316 thyroid hormone receptor-associated protein; 150 kDa subunit 2.012

120548 AA978846 Hs.187634 ESTs 2.011

Hs.30567 ESTs 2.011

123079 AA4A5fld1 Hs.104308 ESTs 2.009

19364A AAA0Q393 Hs11268Q ESTs 2.008

11RR75 H677AQ He: 161099 EST 2.003

103179 X69398 Hs.82685 CD47 antigen (Rh-related antigen; integrin-associated signal transducer) 1.995

IUO*f/D IU//39 Hs.38991 S100 calcium-binding protein A2 1.995

111007 W(v337fl Hs.22543 ESTs 1.995

190470 A AO m 707 zs11f3.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 1.989

Hs.26040 ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens] 1.989

11A19T 73ftfi59 Hs.106961 ESTs; Weakly similar to TYL [H.sapiens] 1.988

HOQQftO A Hs.129872 sperm surface protein 1.988

IfiftCOA A AAQARAfl ESTs 1.988

108933 AA147224 Hs.71814 ESTs 1.986

1 ae one AA401A33 Hs.22380 ESTs 1.982

iftQAOQ AA1C7Q11 Hs.72200 ESTs 1.982

118470 N66769 Hs^2781 ESTs 1.975

11535ft AA9fl1ftftfi Hs.88923 ESTs 1.975

115257 AA278060 Hs.193516 B-cell CLL/lymphoma 10 1.974

1 9ftft7Q A A71 Q77fi zh38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414390 1.974

Hs.26966 ESTs 1.973

1 971 1 1 A AA0579A Hs.220509 ESTs 1.969

101266 L36645 Hs.73964 EphA4 1.966

Hs.30340 ESTs 1.965

Hs.126083 ESTs 1.962

Hs.169882 ESTs 1.961

Hs.190504 ESTs 1.959

1 0OfiQO A AAGS157Q Hs.12017 KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5 1.956

10549* AA95119Q Hs.24416 ESTs 1.953


134740 L37362 Hs.89455 opioid receptor; kappa 1 1.95

109324 AA2107CO Hs.86405 Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056) 1.95

124303 H93043 Hs.107070 ESTs 1.95

102337 U36922 Human fork head domain protein (FKHR) mRNA, 3' end 1.948

109441 AA2281C0 Hs.86998 nuclear factor of activated T-cells 5 1.945

127364 AA179573 Hs.90061 progesterone binding protein 1.942

105255 AA227498 Hs.3623 ESTs 1.942

130672 L18783 Hs.177 phosphatidylinositol glycan; class H 1.942

104301 D45332 Hs.6783 ESTs 1.94

132442 R62589 Hs.167419 ESTs 1.939

105519 AA258063 Hs.23438 ESTs 1.937

132902 AA490989 Hs.168147 ESTs 1.936

118873 N89881 Hs.44577 ESTs 1.936

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein [H.sapiens] 1.934

115075 AA255486 Hs.88045 ESTs 1.933

110895 H93483 Hs.124777 ESTs 1.931

105360 AA236209 Hs.187626 ESTs 1.931

124998 T56013 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 1.929

121816 AA424814 Hs.187509 ESTs 1.927

111717 R23241 Hs.110776 STAT induced STAT inhibitor-2 1.925

128874 H06245 Hs.106801 ESTs 1.925

109391 AA219699 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1.913

126129 H82165 Hs.40334 ESTs 1.911

115553 AA369027 Hs.71414 ESTs 1.905

113811 W44928 Hs.4878 ESTs 1.905

108345 AA070906 zm66d1.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1.904

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1.903

116602 D60063 Hs.241673 EST 1.901

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 1.9

125330 AA401804 Hs.114574 ESTs 1.896

130095 F01831 Hs.14838 ESTs 1.894

119782 W72982 Hs.58262 ESTs 1.894

104115 AA428090 Hs.26102 ESTs 1.893

131313 C17938 Hs.22370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1.891

105583 AA278907 Hs.24549 ESTs 1.891

122825 AA461195 Hs.99580 ESTs 1.887

119495 W35390 Hs.55533 ESTs 1.886

130309 AA134289 Hs.15423 Homo sapiens BAC clone RG114B19 from 7q31.1 1.886

125628 AA418069 Hs.241493 natural killer-tumor recognition sequence 1.886

110611 H66947 Hs.14671 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1.885

117301 N22569 Hs.43215 ESTs 1.884

131406 N92239 Hs.26471 Wnt inhibitory factor-1 1.881

126428 AA013312 Hs.64988 ESTs 1.881

120285 AA182882 Hs.111110 titin-cap (telethonin) 1.878

112724 R91753 Hs.17757 ESTs 1.878

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1.875

124381 N26765 Hs.109008 ESTs 1.875

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1.875

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1.875

111229 N69113 Hs.110855 ESTs 1.875

120627 AA285079 Hs.190474 ESTs 1.873

107048 AA600012 Hs.10669 ESTs; Moderately similar to KIMMOO [H.sapiens] 1.872

104041 AA381902 Hs.197114 RNA binding protein 1.872

115162 AA258366 Hs.227806 ras GTPase activating protein-like 1.872

102239 U26726 Hs.1376 hydroxysteroid (11-beta) dehydrogenase 2 1.87

100043 M10098 AFFX control: 18S ribosomal RNA 1.868

120296 AA191353 Hs.22385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1.867

129011 S72869 Hs.107932 DNA segment; single copy; probe pH4 (transforming sequence; thyroid-1; 1.867

134851 R44479 Hs.90232 KIAA0552 gene product 1.866

117392 N26175 Hs.93405 ESTs 1.864

114530 AA053027 Hs.191797 ESTs 1.863

123541 AA608794 Hs.112592 ESTs 1.863

124890 R78618 Hs.34145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1.862

105299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1.861

103560 Z20656 Hs.182787 myosin; heavy polypept 6; cardiac muscle; alpha 1.861

113073 T33637 Hs.6841 ESTs 1.86

120407 AA235040 Hs.107283 ESTs 1.859

103892 AA243523 Hs.17155 ESTs 1.858

123795 AA620381 Hs.70488 ESTs 1.857

108524 AA084323 Hs.68138 ESTs 1.857

113953 W85812 Hs.187554 ESTs 1.856

110721 H97678 Hs.31319 ESTs 1.856

129426 AA412087 Hs.168272 EST; Highly smlr to prot Inhibitor of activated STAT prot PIASx-alpha [H.sapiens] 1.853

112102 R44840 Hs.21303 ESTs 1.852

118502 N67317 Hs.50150 ESTs 1.852

107619 AA004955 Hs.60015 ESTs 1.851

100438 D87446 Hs.75912 KIAA0257 protein 1.85

120652 AA287312 Hs.191648 ESTs 1.85

121643 AA417078 Hs.193767 ESTs 1.843

117387 N26011 Hs.53810 ESTs 1.843

132084 Y12394 Hs.3886 karyopherin alpha 3 (importin alpha 4) 1.843

124449 N48593 Hs.121820 ESTs 1.841

120263 AA173440 Hs.193919 ESTs 1.838

127226 AA731036 Hs.3463 ribosomal protein S23 1.838

111837 R36447 Hs.24453 ESTs 1.835

128727 M64174 Hs.50651 Janus kinase 1 (a protein tyrosine kinase) 1.834

114439 AA018937 Hs.128629 ESTs 1.833

102332 U35637 Human nebulin mRNA, partial cds 1.83

126579 W72979 Hs.146082 ESTs 1.83

102341 U37122 Hs.8110 adducin 3 (gamma) 1.83

114246 Z39848 Hs.12079 ESTs 1.828

131757 D17532 Hs316 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 6 (RNA helicase; 64kD) 1.823

108904 AA136521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1.823

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1.823

131957 AA609008 Hs.183232 ESTs 1.822

100131 D12485 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1.822

124163 H30539 Hs.189838 ESTs 1.821

118204 N59859 Hs.48443 ESTs 1.821

107727 AA016021 Hs.173091 DKFZP434K151 protein 1.82

100357 D78156 Hs.241548 RAS p21 protein activator 2 1.82

116295 AA489018 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 1.82

124833 R54112 Hs.128697 ESTs 1.817

122587 AA453255 Hs.6968 ESTs 1.817

114359 Z41589 Hs.153483 ESTs; Moderately similar to H1 chloride channel [H.sapiens] 1.815

111289 N72253 Hs.238246 ESTs 1.813

110826 N30068 Hs.15347 ESTs 1.812

104106 AA422123 Hs.42457 ESTs 1.811

130043 AA055404 Hs.193953 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1553

115864 AA432080 Hs31200 ESTs 1.81

129737 AA056140 Hs.122684 ESTs 1.81

124477 N53158 Hs.102682 ESTs 1.809

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1.806

106101 AA421053 Hs.34395 ESTs 1.806

115479 AA287696 zs52h09.s1 NCI_CGAP_GCB1 H sapiens cDNA clone IMAGE:701153 1.804

116104 AA456635 Hs.78524 ESTs 1.804

114173 Z39050 Hs.21963 ESTs 1.804

132632 N59764 Hs.5398 guanine-monophosphate synthetase 1.803

119135 R49548 Hs.169581 death effector domain-containing 1.802

131559 N91087 Hs.28728 ESTs; Weakly similar to F55A12.3 [C.elegans] 1.801

126922 AA177138 Hs.161671 ESTs 1.8

117375 N25427 Hs.108812 ESTs 1.8

103571 Z25535 Hs.211608 nucleoporin 153kD 1.8

105978 AA406367 Hs.15973 ESTs 1.8

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798

105777 AA348412 Hs.23096 ESTs 1.797 

110166 H19480 Hs.174309 ESTs 1.796 

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H.sapiens] 1.796

105427 AA251330 Hs.26248 ESTs 1.795 

115278 AA279757 Hs.67466 ESTs; Weakly similar to BACN32G11.d [D.melanogaster] 1.794

133104 L13698 Hs35029 growth arrest-specific 1 1.794 

131170 N48674 Hs.23796 Human DNA sequence from clone 1052M9 on chromosome Xq25. Contains the 1.792 

100136 D13540 Hs.22868 protein tyrosine phosphatase; non-receptor type 11 1.791

127263 AA331157 EST350G5 Embryo, 6 week, subtracted (total cDNA) I Homo sapiens cDNA 1.79 

114157 Z38878 Hs.24979 ESTs 1.79

125601 AI096717 Hs.247043 KIAA0525 protein 1.788

118472 N66818 Hs.42179 ESTs 1.787 

55 112456 R63925 Hs.28464 ESTs 1.787 

130236 N69682 Hs.51957 SC35-interacting protein 1 1.786

133297 AA600057 Hs.70266 KIAA0905 protein 1.784 

125650 R40096 Hs.176578 ESTs 1.784 

132056 T89386 Hs38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783 

60 129093 AA262710 Hs.108614 KIAA0627 protein 1.783 

123176 AA489020 Hs.193424 ESTs 1.782 

106340 AA441792 Hs.22857 chord domain-containing protein 1 1.781

100598 HG2463-HT2559 Guanine Nucleotide-Binding Protein G25k 1.779

104038 AA374532 EST86676 HSC172 cells I Homo sapiens cDNA 5' end, mRNA sequence 1.778

65 122235 AA436475 Hs.190104 ESTs 1.777 

105104 AA151771 Hs.76941 ATPase; Na+/K+ transporting; beta 3 polypeptide 1.776

107601 AA004636 Hs3Q223 ESTs 1.776 

131467 W68255 Hs.27194 DKFZP434K171 protein 1.776 

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776

107969 AA034030 Hs.155212 methylmalonyl Coenzyme A mutase 1.775

115527 AA342079 Hs.252055 ESTs 1.775

132471 T16305 Hs.49349 beta-site APP-cleaving enzyme 1.775

105966 AA406105 Hs.5344 adaptor-related protein complex 1; gamma 1 subunit 1.774 

5 127548 AA373091 Hs.93832 Homo sapiens clone 24483 unknown mRNA; partial cds 1.774 

106217 AA428379 Hs.24870 ESTs 1.773 

131214 N26777 Hs.172635 ESTs 1.773 

106295 AA435664 Hs.8583 similar to APOBEC1 1.773 

106328 AA436705 Hs.28020 KIAA0766 gene product 1.772 

10 124661 N93797 Hs.3090 EphB1 1.772 

122988 AA479166 Hs.105633 ESTs 1.772 

115504 AA291948 Hs.42738 ESTs 1.771 

105168 AA180208 Hs.16606 ESTs; Highly similar to CGI-32 protein [H.sapiens] 1.767

129153 AA186616 Hs.181461 ariadne; Drosophila; homolog of 1.766

105829 AA398290 Hs.21965 ESTs 1.764

101811 M86917 Hs.24734 oxysterol binding protein 1.764 

100138 D13628 Hs.2463 angiopoietin 1 1.764 

124704 R07335 ye96d.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 1.763

122314 AA442257 Hs.192076 ESTs 1.762

109865 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.761

106206 AA428069 Hs.89519 KIAA1046 protein 1.758 

107135 AA620782 Hs.23247 ESTs 1.757 

105760 AA338960 Hs.28170 ESTs 1.756 

106288 AA435536 Hs.24336 ESTs 1.756 

25 103968 AA304566 Hs.3542 ESTs 1.756 

129559 AA234945 Hs.11360 ESTs 1.756 

117885 N50112 Hs.47023 ESTs 1.754 

107032 AA599472 Hs.247309 succinate-CoA ligase; GDP-forming; beta subunit 1.754

124807 R45963 Hs.233811 ESTs; Weakly similar to ORF2 [M.musculus] 1.753 

30 100276 D42047 Hs.82432 KIAA0089 protein 1.753 

110924 N47938 yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 1.751

133002 AF006082 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.751

132530 AA455917 Hs.50785 SEC22; vesicle trafficking protein (S. cerevisiae)-like 1 1.75

110759 N21671 Hs.19025 ESTs 1.75

106138 AA424515 Hs.33264 ESTs 1.75

107348 U43701 Hs.184776 ribosomal protein L23a 1.75

115867 AA432162 Hs.165986 DKFZP586B2022 protein 1.749

135398 AA194075 Hs.99908 nuclear receptor coactivator 4 1.747

113783 W19222 Hs.7041 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.747

40 134898 X98330 Hs.90821 ryanodine receptor 2 (cardiac) 1.745 

132215 T10132 Hs.4236 KIAA0476 gene product 1.744 

104229 AB002346 Hs.61289 synaptojanin 2 1.743 

116166 AA461558 Hs.202949 KIAA1102 protein 1.743

115433 AA284252 Hs.58372 ESTs 1.743 

45 114908 AA236545 Hs.54973 ESTs 1.742 

127425 AA470941 Hs.143162 ESTs 1.741 

131089 Z38807 Hs.22870 ESTs 1.739 

113498 T88908 Hs.189748 ESTs 1.738 

116710 F10577 Hs.70312 ESTs 1.735 

50 127210 R51476 yg76f04.r1 Soares infant brain 1 NIB Homo sapiens cDNA clone 1.733 

120554 AA279654 Hs.194524 ESTs 1.733

129940 U18242 Hs.13572 calcium modulating ligand 1.732

117023 H88157 Hs.41105 ESTs 1.731

111700 R22212 Hs.23361 ESTs 1.731

118911 H72240 Hs.39292 ESTs; Moderately similar to KIAA0745 protein [H.sapiens] 1.731

106025 AA412063 Hs.6065 ESTs 1.728 

108626 AA101934 Hs.61697 G-protein coupled receptor 1.726 

111614 R12581 Hs.191146 ESTs 1.726 

134134 L76703 Hs.173326 protein phosphatase 2; regulatory subunit B (B56); epsilon isoform 1.725

60 106886 AA489086 Hs.36545 ESTs 1.725 

117998 N52136 Hs.93828 ESTs 1.725 

121204 AA400422 Hs.55896 ESTs 1.725 

121342 AA404995 Hs.192430 ESTs 1.725

131129 R27296 Hs.23240 ESTs 1.725 

65 116235 AA479181 Hs.186726 ESTs 1.725 

102423 U44754 Hs.179312 small nuclear RNA activating complex; polypeptide 1; 43kD 1.724

110273 H29050 Hs.24095 ESTs 1.722

108758 AA127395 Hs.222414 ESTs 1.722 

110672 H88477 Hs.191178 ESTs 1.721 






120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D28915 Hs.82316 Interferon-induced; hepatitis C-associated microtubular aggregate prot (44kD) 1.719

129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719

134663 W73367 Hs.8750 ESTs 1.717 

104902 AA055475 Hs.104143 clathrin; light polypeptide (Lea) 1.717 

120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331191_1 [H.sapiens] 1.717

134891 F03517 Hs.90787 ESTs 1.716 

106219 AA428567 Hs.26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715

116372 AA521311 Hs.13854 ESTs 1.713

107570 AA001870 Hs.237323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713

106198 AA427816 Hs.11803 ESTs 1.712 

125136 W31479 Hs.129051 ESTs 1.712 

104973 AA085676 Hs.6763 KIAA0942 protein 1.712 

128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptide 5 1.711

123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711

127871 AA766511 Hs.128848 ESTs 1.71

116089 AA455933 Hs.41324 ESTs 1.709

123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGL050w [S.cerevisiae] 1.708

123619 AA609200 Hs.162686 ESTs 1.708

104781 AA026617 Hs.21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707

115114 AA256468 Hs.88148 ESTs 1.705

117852 N49408 Hs.138102 KIAA0853 protein 1.705

127644 T57570 Hs.77039 ribosomal protein S3A 1.704 

111359 N91273 Hs.27179 ESTs 1.702 

131721 L36644 Hs.31092 EphA5 1.7 

132438 F08925 Hs.48610 ESTs 1.7 
132476 N67192 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat region mRNA 1.7
130990 F02488 Hs.21917 KIAA0768 protein 1.7
128499 AA487503 Hs.100636 ESTs 1.698
120780 AA342337 Hs.241569 ESTs; Modtly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.697
132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696
135037 U77948 Hs.184122 general transcription factor II; i 1.696
110024 H11297 Hs.31050 ESTs 1.695
134415 AA329274 Hs.82911 protein tyrosine phosphatase type IVA; member 2 1.694
102223 U24685 Hs.146226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 non-productive rearrangement 1.694

126712 AA205862 Hs.7942 ESTs 1.694

101507 M27492 Hs.82112 interleukin 1 receptor; type I 1.692

106291 AA435551 Hs.30824 ESTs 1.691

116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRSBP76 [H.sapiens] 1.69

135339 D59269 Hs.127642 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69

118250 N62602 yz75b6.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element;, mRNA sequence 1.689

106470 AA450118 Hs.186180 ESTs 1.688 

108203 AA057678 Hs.63408 ESTs 1.687 

110748 W70313 Hs.126906 ESTs 1.686 

116576 D51228 Hs.79404 neuron-specific protein 1.683 

123035 AA481392 Hs.105166 ESTs 1.683 

126668 AA011616 Hs.184086 ESTs 1.681

101512 M28209 Hs.250716 RAB1; member RAS oncogene family 1.678

102704 U76638 Hs.54089 BRCA1 associated RING domain 1 1.677

126218 AA256386 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676

111180 N67277 Hs.8403 ESTs 1.676

105937 AA404342 Hs.173531 ESTs 1.675

114118 Z38520 Hs.175930 ESTs 1.675

109203 AA190634 Hs.108787 endoplasmic reticulum membrane protein 1.675

125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675

102906 X06956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675

125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674

134294 U63289 Hs.81248 CUG triplet repeat; RNA-binding protein 1 1.674

109742 F10108 Hs.183333 ESTs 1.673

134674 D63876 Hs.87726 KIAA0154 protein 1.673

104079 AA402937 Hs.103238 ESTs 1.671

107554 AA001386 Hs.59844 ESTs 1.671

132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668
124300 H92575 Hs.105959 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.668
126809 AA743475 Hs.171693 ESTs 1.667

106095 AA419547 Hs.11713 ESTs

101754 M77142 Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein

105168 AA192306 Hs.23926 ESTs

113582 T91371 Hs.16824 EST

119559 W38197 Accession not listed in Genbank

119961 W67535 Hs£9015 ring finger protein 9

123255 AA490890 Hs.105273 ESTs

111078 N59230 Hs.186574 ESTs

113082 T40528 Hs.8246 ESTs

119589 W44692 Hs.124177 ESTs

104308 D53639 Hs.77904 ribosomal protein S26

103073 X59417 Hs.74077 proteasome (prosome; macropain) subunit; alpha type; 8

124424 N35314 Hs.107265 ESTs

128890 AA096157 Hs.182364 ESTs; Weakly similar to 25 kDa trypsin inhibitor [H.sapiens]

119400 T92767 ye27d06.s1 Stratagene lung (#937210) Homo sapiens cDNA clone IMAGE:118955 3', mRNA sequence.

131631 AA486868 Hs.29802 slit (Drosophila) homolog 2

118229 N62339 Hs.180532 heat shock 90kD protein 1; alpha

118533 N67954 Hs.49413 ESTs

130666 AA476307 Hs.194035 KIAA0737 gene product

103093 X60708 Hs.44926 dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2)

128667 U69140 Hs.103419 fasciculation and elongation protein zeta 2 (zygin II)

112933 T15530 Hs.221439 ESTs

114546 AA056263 Hs.132747 ESTs

126705 AA579377 Hs.180532 heat shock 90kD protein 1; alpha

114399 AA007595 Hs.220937 ESTs

118836 N79820 Hs.50854 ESTs

100401 D85423 Homo sapiens mRNA for Cdc5, partial cds

105681 AA284865 Hs.171228 KIAA1040 protein

132526 AA460128 Hs.5074 similar to S. pombe dim 1+

133809 AA034002 Hs.76359 catalase

115968 AA447083 Hs.134522 ESTs

116370 AA521256 Hs.236204 ESTs; Moderately similar to NUCLEAR PORE COMPLEX PROTEIN NUP107 [R.norvegicus]

109644 F04477 HS.2048Q2 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE DEHYDROGENASE; LIVER [H.sapiens]

103427 X973Q3 H.sapiens mRNA for Ptg-12 protein

132186 T33888 Hs.221040 KIAA1038 protein

131428 U17838 Hs.26719 PR domain containing 2; with ZNF domain

126638 AA649257 Hs.188602 ESTs

114503 AA039568 Hs.188083 ESTs

121242 AA400857 Hs.97509 EST

122414 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens]

110632 H72344 Hs.171635 ESTs

111389 il95837 Hs.169111 ESTs; Weakly similar to LB2A [D.melanogaster]

112449 R63802 Hs.124186 ring finger protein 2

113070 T33464 Hs.6298 ESTs

107229 D59284 Hs34644 ESTs

132710 W93726 Hs.55279 protease inhibitor 5 (maspin)

124664 N94814 Hs.33540 ESTs; Weakly similar to KIAA0765 protein [H.sapiens]

130168 AA350690 Hs.151411 KIAA0916 protein

125040 T78451 Hs.199961 ESTs

132972 H39627 Hs.164967 ESTs; Weakly similar to !! ALU SUBFAMILY SB WARNING ENTRY !! [H.sapiens]

115873 AA433916 Hs.90093 heat shock 70kD protein 4

120408 AA235045 Hs.190151 ESTs

120934 AA383773 Hs.191500 ESTs

115259 AA279071 Hs.13453 splicing factor 3b; subunit 1; 155kD

134330 j>201 13 Hs.8185 ESTs; Highly similar to CGI-44 protein [H.sapiens]

115117 AA256492 Hs.49007 poly(A) polymerase

125162 W44682 Hs.109696 ESTs

103946 AA285246 Hs.111650 ESTs; Weakly similar to Prt1 homolog [H.sapiens]

133389 AA166917 Hs.72639 ESTs

115528 AA342301 Hs.53929 ESTs; Weakly similar to !! ALU CLASS B WARNING ENTRY !! [H.sapiens]

129704 W81301 Hs.12064 ubiquitin specific protease 22

109313 AA206800 Hs.86276 ESTs; Moderately similar to zinc finger protein dp [H.sapiens]

130457 U58091 Hs.155976 cullin 4B

123076 AA485211 Hs.190048 ESTs

115113 AA256460 Hs.44810 ESTs

117731 N46433 Hs.46609 ESTs

123344 AA504338 Hs.171857 ESTs 1.599

131798 X88098 Hs.3238 adenovirus 5 E1A binding protein 1.597

125370 AA256743 Hs.151791 KIAA0092 gene product 1.596

114918 AA236813 Hs.72324 ESTs; Highly similar to unknown [H.sapiens] 1.596

114807 AA160805 Hs.199832 ESTs 1.596

105103 AA151593 Hs.10130 ESTs 1.594

125004 T60120 yb68f02.s1 Stratagene ovary (#937217) Homo sapiens cDNA clone IMAGE:76347 3', mRNA sequence. 1.592

105558 AA282914 Hs.10176 ESTs 1.589

110455 H52172 yt85e8.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:23111 3' similar to contains Alu repetitive element;, mRNA sequence 1.589

119780 W72987 Hs.191381 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.587

126983 AA211537 zn55d01.r1 Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562081 5', mRNA sequence. 1.586

134675 AA250745 Hs.87773 protein kinase; cAMP-dependent; catalytic; beta 1.584

105431 AA252033 Hs.15036 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.584

120187 Z40251 Hs56974 ESTs 1.584

115830 AA428137 Hs.86434 ESTs 1.581

135069 AA456311 Hs.93961 ESTs; Weakly similar to !! ALU CLASS A WARNING ENTRY !! [H.sapiens] 1.581

122997 AA479295 Hs.106290 Kelch motif containing protein 1.581

119707 W67569 Hs.44143 ESTs; Weakly similar to SNF2alpha protein [H.sapiens] 1.58

131934 D80348 Hs.34922 ESTs 1.58

106141 AA424558 Hs.9302 phosducin-like 1.58

115271 AA279422 Hs5724 ESTs 1.579

131468 R27598 Hs^7197 KIAA0797 protein 1.577

131165 R98173 Hs23763 Max-interacting protein 1.575

117273 N21680 Hs.43047 ESTs 1.575

101569 M33772 Hs.182421 troponin C2; fast 1.575

116127 AA459703 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 1.575

120022 W90625 Hs58432 ESTs 1.575

117512 N32157 Hs.82207 ESTs 1.574

106511 AA452865 Hs.206713 UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase; polypeptide 2 1.573

116415 AA609204 Hs.27973 KIAA0874 protein 1.573

127879 AA810215 Hs.189079 ESTs 1.571

125211 W72798 Hs.103177 ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans] 1.571

114748 AA135638 Hs.223756 ESTs 1.571

122698 AA456112 Hs.99410 ESTs 1.57

116765 H12636 Hs.121585 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 1.568

130895 AA609828 Hs.21015 ESTs; Highly similar to tetracycline transporter-like protein [M.musculus] 1.568

114338 Z41366 Hs.40109 KIAA0872 protein 1.567

111005 N53076 Hs5996 ESTs 1.567

128135 AA913491 Hs.189143 ESTs; Modtly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.567

112046 R43365 Hs.22273 ESTs 1.566

132160 AA281770 Hs.184081 seven in absentia (Drosophila) homolog 1 1.566

111568 R10153 Hs.20561 ESTs 1.566

127775 H04106 Hs.179902 ESTs; Weakly similar to NG22 [H.sapiens] 1.566

115359 AA281936 Hs.88914 ESTs 1.566

121845 AA425734 Hs.165066 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.565

127854 AA769520 ESTs; Weakly similar to REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1 [H.sapiens] 1.564

120287 AA187679 Hs.111114 ESTs 1.563

114940 AA243012 HsJ5928 ESTs 1.562

126716 AA031700 Hs.251862 ESTs 1.562

134161 U97188 Hs.79440 IGF-II mRNA-binding protein 3 1.561

125390 H95094 Hs.75187 translocase of outer mitochondrial membrane 20 (yeast) homolog 1.561

115334 AA281244 Hs.65300 ESTs 1.559

113721 T97931 Hs.18190 EST 1.558

114895 AA236177 Hs.76591 KIAA0887 protein 1.558

119341 T62571 Hs.146388 microtubule-associated protein 7 1.558

108012 AA039616 Hs51933 ESTs 1.558

130335 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.557

134351 R82074 Hs.82109 syndecan 1 1.557

133300 D51401 Hs.70333 ESTs 1.553

106920 AA490899 Hs.24462 ESTs 1.553

118744 N74075 Hs.94293 EST 1.552

126489 W20016 Hs.144228 ESTs; Weakly similar to ZINC FINGER PROTEIN 83 [H.sapiens] 1.55

115913 AA438720 Hs.65487 ESTs 1.55

107868 AA025234 Hs.61260 ESTs 1.55

134520 N21407 Hs.257325 ESTs 1.55

109703 F09684 Hs.24792 ESTs; Weakly similar to ORF YOR283w [S.cerevisiae] 1.55

120286 AA187938 Hs35189 ESTs; Weakly similar to F25B5.3 [C.elegans] 1.548

106356 AA443277 Hs31034 peroxisomal biogenesis factor 11A 1.548

129460 AA235627 Hs.11171 APG5 (autophagy 5; S.cerevisiae)-like 1.547

133950 011861 Hs.77823 ESTs 1.546

128172 AI400862 Hs.142607 ESTs 1.546

114162 Z38909 Hs£2265 ESTs 1.545

101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 1 1.544

113617 T93630 Hs.17207 ESTs 1.542

104896 AA054228 Hs^3165 ESTs 1.541

114477 AA032013 Hs.144260 EST 1.54

110731 H98653 Hs.188006 KIAA0878 protein 1.54

130367 Z38501 Hs.8768 ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 1.538

130539 L07044 Hs.250857 Homo sapiens ca Id 1.538

134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1.537

130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein encoded in cosmid T20D3 [H.sapiens] 1.537

133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1.537

106450 AA449469 Hs.11859 ESTs 1.536

104120 AA429838 Hs.89519 KIAA1048 protein 1.536

100533 HG1879-HT1919 Ras-Like Protein Tc10 1.535

130664 R09049 Hs.17625 ESTs 1.535

127122 AA279153 Hs.190049 ESTs 1.535

134264 T03391 Hs.8087 ESTs 1.535

132319 AA418662 Hs.44625 ESTs 1.535

115465 AA266941 Hs.43691 ESTs 1.533

125003 T59442 Hs.100445 ESTs 1.532

102273 U30888 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1.532

121875 AA426299 Hs.98510 ESTs 1.532

114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.531

132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 1.53

111199 N68210 Hs.29822 ESTs 1.53

113494 T88878 Hs.258738 ESTs 1.529

129515 AA49Q882 Hs.112227 ESTs 1.528

133124 AA156049 Hs.65490 ESTs 1.528

104785 AA027163 Hs.7942 ESTs 1.526

105595 AA279408 Hs.25866 ESTs 1.526

130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1.526

114297 Z40758 Hs.173091 DKFZP434K151 protein 1.525

112876 T03488 Hs.4842 ESTs 1.525

127500 AA525014 Hs.162115 ESTs 1.525

120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1.525

119859 W80702 Hs.58461 ESTs 1.525

129944 L00389 Hs.1381 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1.524

118864 N89870 Hs.42148 ESTs; Weakly similar to Su(P) [D.melanogaster] 1.523

123964 C13961 Hs.210115 EST 1.523

111676 R19414 Hs.166459 ESTs 1.522

128332 AI079523 Hs.134173 ESTs 1.522

130455 X17059 Hs.155956 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1.521

125181 W58461 Hs.12396 ESTs 1.521

127093 AA768241 oa72d02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1317795 3', mRNA sequence. 1.521

132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1.521

125303 Z39821 Hs.107295 ESTs 1.52

132697 AA281951 Hs3518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp568J2146) 1.52

117086 H93135 Hs.41840 ESTs 1.519

113355 T79203 Hs.14480 ESTs 1.518

108621 AA101811 Hs.69506 ESTs 1.518

109384 AA219172 Hs.86849 EST 1.518

128510 X94703 Hs.100316 RAB28; member RAS oncogene family 1.517

132968 N77151 Hs.81638 myosin X 1.515

117035 H88798 Hs.41182 ESTs 1.515

116781 H22985 Hs32132 ESTs 1.513

108677 AA115629 Hs.118531 ESTs 1.513

130214 H76003 Hs.15266 ESTs 1.513

134700 AA481414 Hs.8868 golgi SNAP receptor complex member 1 1.512

116618 D80783 Hs.45224 ESTs 1.508

126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1.508

125659 AA806808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.508

113837 W57698 Hs.8888 ESTs 1.507

114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.507

100311 D50540 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1.507

126802 AA947601 Hs.97058 ESTs 1.506

128661 R82837 Hs.103329 KIAA0970 protein 1.506

134194 AA233231 Hs.79828 ESTs 1.506

108953 AA149652 Hs.42128 ESTs 1.504

133240 D31161 Hs.68513 ESTs 1.502

132671 X76302 Hs.54649 putative nucleic acid binding protein RY-1 1.501

132609 Z48923 Hs53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1.501

105574 AA278678 Hs^58567 ESTs 1.5

113718 T97782 Hs.256268 ESTs 1.5

127824 AI208365 Hs.127811 ESTs 1.5

130132 U55936 Hs.184376 synaptosomal-associated protein; 23kD 1.5

127394 AA453224 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 1.5

100485 HG1111-HT1111 Ras-Like Protein Tc21 1.5

101078 L04510 Hs.792 ADP-ribosylation factor domain protein 1; 64kD 1.5

128611 AA456845 Hs.102471 KIAA0680 gene product 1.5






TABLE 12A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 12. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
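
The three-column layout described above (Pkey, CAT number, Accession) can be read as a simple keyed record. The following is a minimal sketch only, assuming each row is a single whitespace-separated line with one Pkey, one CAT number, and one or more Genbank accessions; the names `ClusterRow`, `parse_row`, and `index_by_pkey` are hypothetical and are not part of the original disclosure, and the sample row is used purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ClusterRow:
    """One illustrative Table 12A row (hypothetical structure)."""
    pkey: int                      # unique Eos probeset identifier number
    cat_number: str                # gene cluster number, e.g. "27608_1"
    accessions: Tuple[str, ...]    # Genbank accessions comprising the cluster


def parse_row(line: str) -> ClusterRow:
    """Split a row into Pkey, CAT number, and the remaining accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], tuple(fields[2:]))


def index_by_pkey(lines: Iterable[str]) -> Dict[int, ClusterRow]:
    """Build a Pkey -> row lookup so a probeset's cluster can be retrieved."""
    return {row.pkey: row for row in map(parse_row, lines)}


# Assumed example row, written in the Pkey / CAT number / Accession order.
rows = ["102313 27608_1 U33921 AI190489 AA573311"]
table = index_by_pkey(rows)
print(table[102313].cat_number, table[102313].accessions)
```

Keying on the Pkey mirrors how the table is meant to be consulted: a probeset without a UniGene ID is looked up by its Pkey to find the cluster and the member accessions.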



Pkey CAT number Accession 



108536 118811_1
117040 46956_1
100782 18457_1

100819 3022_1

100824 5_36

125004 264197_1
102313 27608_1
102337 553_1

124704

124825
110455
126257
125624
104038
103427

292319_1
185904_1
330773_1
46874_1
182217_1
154135_1
264235_1
43892_1

104142 113242_1
127093 47721_1



AA084524 AA339253 AW966289 

AW970600 AA503323 H89218 APD66031 H89112 

AA355435 NM.Q01 516 Z30093 T28405 AW949486 AA461142 AA410532 A1652073 AA521208 AI970141 AI968234 Al 026 102 
AA713583 AW135876 AA936614 AA770300 AI242635 AA377033 AW960263 AW607683 AI273603 AA410287 A I CMOS 13 
AA460838 AI803916 AW294095 AW449680 AW798677 AW675048 BE5421 16 AL12Q521 

L34840 NMJ003241 U31905 AI546931 AI791616 AI973065 AI792321 A1546937 AI685880 AI732835 A1682360 AA420653 
AA564047 AI682323 AI824614 AI659889 A1680052 A1970837 AI623108 AA420692 AI418074 AA631018 AI810595 AW291463 
AW449930 A1668908 AI970818 

AI393237 A1521317 AI761348 AF025841 D43968AW994987 L34598 AF025841 D89789 D89788 D89790 AW998932 
AI971742 AI310238 X90976 AW139668 AW674280 AI365552 AA877452 AV657554 C75229 AA376077 A1798056 AW609213 
W25586 H30149 BE075089 BE075190 AW580858 H99598 AA425238 AA133916 AW363478 BE158121 BE158127 
AW467960 BE158135 BE 158 126 BE158145 N92660 AA847246 At961688 AI361423 AA878154 AA043767 AI863712 
AI559226 AW339007 A1371268 A1368901 AA046624 AA134739 AW449154 AA130232 A1458720 AA96251 1 AI700627 
R70437 AW004008 AA045229 A1671572 H99599 AA043768 AI685454 Al 87 1685 N29937 X90977 AA524240 AI1421 14 
A1825750 A1567805 AI631365 AI347893 AA134740 F20669 AA046707 AW793216 AW963298 AW959380 AA363265 
AI784593AI268201 R69451 AV657618 AI695588 

BE312163 AJ230798 AA374482 A1926059 AA622653 AI860704 BE139185 AW296884 T60238 T60120 
U33921 A1190489 AA573311 

AI814663 AA806761 AA765241 AA019317 AA092255 AA035405 TB5079 AA890151 AI373959 TB5080 BE153728 AA740848 
BE0806B2 AL048137 AW1B2316 AI699463 AW274481 AW407538 AA306562 AW950024 AW949943 AL045703 AW843196 
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AA385181 AA164998 
AI246476 AA345406 AI277554 AA134749 AA856624 BE613247 AA299003 AL0481 38 AA028121 T92510 AI923835 
AW020440 AI401S94 MB89401 N93290 AA044247 AA028100 AI5B2845 AA811151 AI741811 AI925878 AA448277 AA1 72221 
AI214783 BE220793 AA022746 AI082882 AA022849 AI928385 AA573472 AI420686 AW072902 AI799493 A1873506 
AI468977AI192079 AI468976AA044272 AW015701 AW31 6979 AA933042 AA609017AJ31 8393 AI424571 AI934945 
AA172023 AW050917 AA846180 AA134746 AI003947 AJ769769 AW006697 AAB53517 AW575680 AI474214 AA401478 
U36922 AA927064 AA868000 D62654 T91745 AW500202 AA194764 AA746346 AA130464 AW1 17498 AA054526 N26432 
H02534 K04964 AW303367 BE300931 AI218049 A1208073 AW182749 AA983630 AI147585 AA194765 AA054534 AA922720 
AI436585 A1346535 AA134269 AA280923 AA897422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 A1216046 
AW496823 AA019414 H82288 W35284 AI936621 A17671 13 AA866177 AW367874 H82398 AF032885 AW300151 AW467069 
AA809346 AI188507 AI494178 AA872752 Al 63 1631 U02310 NM_002015 AA815006 AI382453 AW197658 AI761654 
AI804396 AI382221 AI813640AI439635 AI523901 AW517242 A12217Q5 AW298104 AW20456O AW573095 AW028783 
AW014650 AI766744 AI808294 A1698758 AJ041809 A1766667 AI479103 AA872797 AA769305 AA765080 AA334166 
AI472322 
R07335R07640 

AW953679 AW953680 AA244436 H82527 AA361046 AA244483 H82526 

AA501669R52088 

H52576AF085971 H52172 

N99638 AW973750 AA328271 H90994 AA558020 AA234435 N59599 R94815 

AW968363 AA465492 R34539 AA16541 1 

AA374532AA421255 

BE514383 AA071273 AW247987 AW673286 BE3121Q2 AW749824 BE071985 AW577383 BE071945 BE072005 AW577355 
BE0719S5 AW239231 BE072000 BE071960 AW577360 AW749830 AW373020 X97303 AW999522 BE000192 BE562219 
BE266655BE264970 
AA074713AA447006 

AW977549 AA256038 AL365415 AW500455 AA768241 AW968097 Z17849 AA256104
125873
125954 



125992 
127210 



10492J 
4457J 



1589048_1 
15307.6 



127263 
135197 

127394 
126879 
1269B3 
120470 
127854 
121367 
106320 



115479 
101026 



232161 1 
29440J 

304844J 

1860.2 

171841J 

188975J 

443883J 

280429,1 

6435J 



201515J 
11075.1 



100401 24827.1 



130542 28089.3 



100485 30576.2 



108345 
100522 

100533 

100598 



102332 
118250 
103878 
119400 
119559 



112277.6 
.1 



32905.1 
23902J 



14745.3 

genbank_N62602
entrez_Z84483
genbank_T92767
entrez_W38197



AW271838 AL133605 C01 646 H29959 AA999896 D60676 AW999454 AW961 176 AA315244 H14437 AW3861 18 N46512 
AW272021 AI768516 BE466421 AI0828Q9 AI804454 AA905101 AW173368 N38942 AW614169 AI080483 N29489 AI500550 
AA994475 AA614464 AA707368 AA593145 AA569473 AW627815 AI828244 N63226 N42300 

NM.016353 AB023584 W44753 R09585 AA382865 R23772 AI314257 AA974046 AK001608 AI935638 AW440609 AI420022 
AA777388 AA806969 AI554876 AI584006 AI688556 AI688634 AI697997 AI0 14540 AI806683 Al 741 202 AW263154 
AW297238 AI149951 AI589076 AW082158 AW614265 AA931887 M761969 R09490 AA484643 AI207121 AI088390 
AI538065 AI619547 Al 741 925 AT702846 H40846 R93943 AW747979 AA461348 U30163 AA326023 AI535992 AW242870 
AI244025 AI222558 W38425 AW473630 AI624599 A1921226 A1683152 AI096458 AI123822 AW170802 C16447 AI337674 
D25726 AW339368 AW771259 AM61 174 
H48372 W01626 
AA305278 AA223833 

1 10924 6443.1 AW058463 AF195768 AA680145 T86901 W60373 W60281 NM.007222 AF106862 AI000795 AA167188 
AW884503 AW891313 AW891332 AW891312 AI984924 A1123518 N75170 AA131614 H25330 AI913358 AI742277 W25576 
R58771 AW445159 AWB88628 AW888627 AW274674 AI088482 N52314 N34282 AW001789 AI338943 T66784 AI288963 
AW468676 AW237528 H25289 N71690 AA610126 AI143458 AI082599 N49144 AA854773 AW66341 1 AW610151 N47938 
AW601626 AA167189 AA916304 AA805205 BE069496 AA652836 BE069499 AI699298 AW249926 AW888578 BE567635 
T10726 AW604715 D54245 D53062 D55610 D55555 AA301376 AI133498 N77788 A1936320 AW090734 A1269977 N50828 
AA550814 AI421993 AI005384 N5Q813D60292 D59349 AA131710D81698 D81699 
AA331 156 AA331 157 AA331155 

U76456 NM.003256 AF057532 AA193414 AW293304 AW963378 AA313095 AI359841 AI969312 AI080163 AW448926 

A1671 136 BE466399 AI637967 AI671873 AW196583 AW071635 AI634427 AW296872 AW292470 AA193650 

BE161832 AA453224 AA485772 

D90391 M55575AI852268AA719776 

AA524886AW971347AA211537 

AW971327 AA524988 AW628653 AA251797 

AW976796AA769520 

AA432071 AA405648 AW000908 T16347 

AB028957 AL120001 AI267678 H10928 R19844 AW970334 AA393182 F05472 F1 171 1 H09908 N50250 AI81541 1 BE463679 
D61468AW97Q253D60889C15548 D61011 D60367 AJ315795 AA534831 D81386 AW235039 A1382158 D81174 AM16899 
AA852310 H09789 H10929 H09813 F09389 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
AK)1 8713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608 T05327 T071 18 AA339352 
AW301608 N46706 AA649093 AA287595 AW81 1753 AA287596 N39260 

NM.001874 J04970 T91426 AW205201 T84979 AA255727 AA847837 R02164T91339 AV651884 AV651835 AV651350 
AV650118AV651338 A12720Q2 A1367796 AA830651 AA262112 AW151198 

AU076696 AA219720 AL135197 AA305877 N56376 AA318063 AA130725 AW954903 BE541230 AW383312 U86753 D85423 
AI679458 Al 122 9 32 AB007892 AI583919 BE160134 F08104 R34903 F13440 AA095444 AA262453 AA191036 R17895 
T81266 BE149776 AI279537 A11431 13 AA361072 AW959030 AW268817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE000290 AA768053 F09494 BE092645 BE172Q99 Z41 177 AAO44750 AI909768 BE140785 BE140574 AW845210 
AW752452 BE243244 AA843664 AI300080 BE169032 AW189979 BE004869 AA621872 Al 95 1772 AE676897 AI926593 
N62813 AI350912 AW608791 AI309602 AI983138 AW875592 AI655073 AW875626 AA130606 AI370827 C75528 C75554 
AW263335 AI344426 BE004788 AA576220 AA604824 AI431405 AA749378 R38882 AW955075 AA173821 G75657 
AA219672 AW768408 R43141 AI431414 AA483343 AI873792 T17294 AW770187 N74285 AI476404 AI088268 AA654152 
AW974864 BE617311 BE243328 BE168049 

U64675 AW167507 AW 167503 BE218568 AA779360 W85722 AL044843 BE159404 AF012086 AW89861 1 AW898610 
BE159405 BE092191 AW890826 AW369841 AW368064 AW606702 AL044731 R82691 AA419346 AM16558 H96045 
AL040450 AI640531 AJ808434 AL046613 AW855784 AW362469 AL048881 AL049015 AA094272 AA888908 AM17294 
AW237786 R59793 AL044916 D82402 AI216854 AI079342 H96406 AL037845 AI915900 AA972133 AI478783 T31074 
Z21 135 Z21396 AA352182 R13918 AA430178 C17811 AI371824 AI742256 AA926801 N79156 AA350610 AA081971 N83639 
R35544 AA312292 AW952080 N42322 M171957 AA565297 R89207 AA504106 AI630782 AA826482 AI301579 T36241 
AW966618 Z28426 AL043480 A1124636 AA393449 T19504 AW887823 AI289814 N53979 AL043571 AI632764 AE859613 
AI986308AI683212 AI984499 AI133258 C05898 AW512761 A1041260 BE466240 Z19161 AI351 190 N 67549 AI373374 
AA400873 AW440914 AW514879 AA770146 AI358754 R51 1 13 AI233773 AA649886 T30543 D54358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 T85597 N62441 R89099 R00035 T85596 R61335 R00128 N63359 AI535964 
A1207768 M31468 NM.012250 W0 1322 AA253280 AA253233 AA293148 AW582106 R79880 AA459547 AA363459 
AA234396 N31669 H44468 AA434567 AW363088 AW993541 
AA070906AA070934 

X51 501 NM.002652 Y1 01 79 J03460 AI791618 AI821473 AA916588 AA564296 AA9161 10 AB72286 AI420470 AI568790 
AI597724 AW205207 AI659305 AI791620 AA532383 AE821475 AA526498 

NM.012249 M31470 AL043108 AA262561 AA178883 T29433 AA313329 W48807 AW404323 AA453560 AW403227 H94816 
W17101 AA165152 W23989 AA091310 

AL121734 D54896 AA424269 BE242906 AA3621 18 BE018454 AI280348 AL048769 M35543 AA757734 Al 128865 H20289 

H23728 AI203445 H41481 H18237 H44081 H92839 AI928621 H75675 D51 148 AI796198 AW390453 D55579 D54145 D53998 

D54015 R37664 H17541 AA668681 T65061 R15867 AW468123 R16049 H69030 AA054226 H16070 F09655 R92144 T03521 

R05473 H92840 AA018186 R91707 

U35637AA112989Z19308 

N62602 

Z84483 

T92767 

W38197

TABLE 13 shows genes, including expressed sequence tags, up-regulated in prostate tumor
tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos Hu02 GeneChip
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar accession number, GenBank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Background-subtracted normal prostate : prostate tumor tissue
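
Because R1 is defined as the ratio of normal prostate signal to prostate tumor signal, a smaller R1 indicates stronger expression in tumor tissue. A minimal Python sketch of that reading follows, assuming the fold up-regulation in tumor is taken as the reciprocal of R1. The function names are illustrative and are not part of the disclosure.

```python
import math


def tumor_fold_increase(r1: float) -> float:
    """Convert R1 (normal : tumor) into a tumor : normal fold value."""
    if r1 <= 0:
        raise ValueError("R1 must be a positive ratio")
    return 1.0 / r1


def log2_tumor_fold_increase(r1: float) -> float:
    """Same quantity on a log2 scale, as often used for expression data."""
    return math.log2(tumor_fold_increase(r1))
```

Under this assumption, an R1 of 0.05 corresponds to roughly a 20-fold higher signal in tumor tissue than in normal tissue.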




Pkey ExAccn UnigeneID Unigene Title R1


333516 CH22_FGENES.173_1 0.028

337954 CH22_EM:AC005500.GENSCAN.96-3 0.029

332496 R73299 Hs.204354 ras homolog gene family; member B 0.03

337944 CH22_EM:AC005500.GENSCAN.89-7 0.033

334111 CH22_FGENES.330_10 0.033

333657 CH22_FGENES.541_2 0.034

327718 CH.04_hsgi|6525284 0.034

336355 CH22_FGENES.817_15 0.035

322011 AL137354 EST cluster (not in UniGene) 0.035

336377 CH22_FGENES.821_5 0.036

300254 AW079607 Hs.188417 ESTs; Weakly similar to ZnT-3 [H.sapiens] 0.037

330096 CH.19_p2gi|6015278 0.037

335191 CH22_FGENES.507_6 0.038

334040 CH22_FGENES.322_8 0.039

333586 CH22_FGENES.204_12 0.04

333295 CH22_FGENES.132_2 0.042

313326 AJ088120 Hs.122329 ESTs 0.043

329517 CH.10_p2gi|3983513 0.043

333403 CH22_FGENES.144_21 0.043

335226 CH22_FGENES.513_11 0.044

335976 CH22_FGENES.652_11 0.045

333637 CH22_FGENES.229_2 0.046

334582 CH22_FGENES.407_5 0.046

336437 CH22_FGENES.828_4 0.047

337461 CH22_FGENES.782-1 0.047

302892 N58545 Hs.6975 histone deacetylase 3 0.049

338689 CH22_EM:AC005500.GENSCAN.475-3 0.049

334721 CH22_FGENES.421_32 0.049

305667 AA864572 EST singleton (not in UniGene) with exon hit 0.049

335498 CH22_FGENES.571_7 0.05

311596 AI682088 Hs.223368 ESTs 0.05

326959 CH.21_hsgi|6469836 0.051

311688 AW025661 Hs.240090 ESTs 0.052

317298 AI922374 Hs.158549 ESTs 0.052

332984 CH22_FGENES.54_6 0.052

321039 AW247083 EST cluster (not in UniGene) 0.053

335844 CH22_FGENES.623_4 0.053

325371 CH.12_hsgi|5866920 0.054

335667 CH22_FGENES.590_18 0.054

333635 CH22_FGENES.228_1 0.054

336736 CH22_FGENES.110-2 0.055

335893 CH22_FGENES.635_1 0.055

333170 CH22_FGENES.94_5 0.055

329768 CH.14_p2gi|6015501 0.055

334030 CH22_FGENES.320_2 0.055

323359 AA234172 Hs.137418 ESTs 0.055

300453 AW051431 Hs.113029 ribosomal protein S25 0.055

334262 CH22_FGENES.367_12 0.055

306590 AI000246 EST singleton (not in UniGene) with exon hit 0.055

331087 R22520 Hs.13398 ESTs 0.055

338620 CH22_EM:AC005500.GENSCAN.450-18 0.056

339045 CH22_DA59H18.GENSCAN.28-5 0.056

308023 AI452732 EST singleton (not in UniGene) with exon hit 0.057

339067 CH22_DA59H18.GENSCAN.33-3 0.057

335689 CH22_FGENES.596_4 0.057

339069 CH22_DA59H18.GENSCAN.33-5 0.057

338176 CH22_EM:AC005500.GENSCAN.219-4 0.057

328159 CH.06_hsgi|5868065 0.058

335655 CH22_FGENES.590_6 0.058

336371 CH22_FGENES.820_1 0.058

336558 CH22_FGENES.842_3 0.059

337738 CH22_EM:AC000097.GENSCAN.10(M 0.059

334273 CH22_FGENES.369_2 0.059

335889 CH22_FGENES.633_3 0.059

327807 CH.05_hsgi|5867968 0.059

333315 CH22_FGENES.138_7 0.059

338825 CH22_OJ246D7.GENSCAN.4-6 0.06

337612 CH22_C20H12.GENSCAN.22-5 0.06

333897 CH22_FGENES.293_4 0.06

335990 CH22_FGENES.655_4 0.06

334264 CH22_FGENES.367_15 0.06

338653 CH22_EM:AC005500.GENSCAN.460-39 0.061

322303 W07459 EST cluster (not in UniGene) 0.061

333498 CH22_FGENES.168_8 0.061

336522 CH22_FGENES.839_3 0.061

301357 AW285677 Hs.137840 ESTs; Moderately similar to HOMEOBOX PROTEIN SIX1 [H.sapiens] 0.062

305917 AA876469 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.062

336143 CH22_FGENES.705_5 0.063

333493 CH22_FGENES.168_2 0.063

332533 M99487 Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 0.063

325844 CH.16_hsgi|6552453 0.063

336402 CH22_FGENES.823_17 0.063

335767 CH22_FGENES.607_1 0.064

301893 T80334 EST cluster (not in UniGene) with exon hit 0.064

324019 AW177009 EST cluster (not in UniGene) 0.064

305801 AA845997 EST singleton (not in UniGene) with exon hit 0.064

335188 CH22_FGENES.507_3 0.065

337533 CH22_FGENES.828-2 0.065

333311 CH22_FGENES.138_3 0.065

335668 CH22_FGENES.590_19 0.065

306786 AI041589 EST singleton (not in UniGene) with exon hit 0.066

306365 AA962086 EST singleton (not in UniGene) with exon hit 0.066

306249 AA933840 EST singleton (not in UniGene) with exon hit 0.066

335018 CH22_FGENES.474_6 0.066

333594 CH22_FGENESi10_3 0.066

333900 CH22_FGENES.293_7 0.066

325207 CH.10_hsgi|6552430 0.067

329888 CH.15_p2 gi|6067149 0.067

326238 CH.17_hsgi|5867260 0.067

333858 CH22_FGENES.241_4 0.067

335809 CH22_FGENES.617_6 0.068

307427 AI243437 EST singleton (not in UniGene) with exon hit 0.068

318428 AI949409 Hs.224583 ESTs 0.069

327005 CH.21_hsgi|5867664 0.069

330463 HG998-HT998 Sulfotransferase, Phenol-Preferring 0.069

333318 CH22_FGENES.138_10 0.07

333313 CH22_FGENES.138_5 0.07

325937 CH.16_hsgi|5867132 0.07

335863 CH22_FGENES.590_14 0.07

335349 CH22_FGENES.539_12 0.07

303396 AA224470 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.07

332603 N66681 Hs.33470 ESTs 0.07

333310 CH22_FGENES.138_2 0.071

309924 AW340812 EST singleton (not in UniGene) with exon hit 0.071

336340 CH22_FGENES.814_15 0.071

308025 AI453365 Hs.172928 collagen; type I; alpha 1 0.071

306805 AI055966 EST singleton (not in UniGene) with exon hit 0.071

335499 CH22_FGENES.571_8 0.071

329669 CH.14_p2gi|6272129 0.071

321666 D28390 EST cluster (not in UniGene) 0.071

338174 CH22_EM:AC005500.GENSCAN.219-2 0.072

305451 AA738105 Hs.140



326943 
333947 
333214 
331917 
339102 
328122 
332250 
328506 
331756 
335193 
317729 
304515 
313644 
326145 
336394 
306516 
300629 

333160 
337490 
305403 
331747 
332792 
330513 
308905 
337419 
333459 
334851 
329046 
327879 
305830 
302928 
304321 
326390 
335230 
334622 
335331 
304753 
301863 
336561 
335811 
305060 
306051 
308289 
334365 
335496 
332634 



CH22_FGENES.842_1 0.072

Immunoglobulin gamma 3 (Gm marker) 0.072

CH22_FGENES.46-1 0.072

CH.21_hsgi|6004446 0.073

CH22_FGENES.303_1 0.074

CH22_FGENES.104_5 0.074
AA446572 Hs.174007 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING 0.074

CH22_DA59H18.GENSCAN.44-9 0.074

CH.06_hsgi|5868031 0.075

KIAA0618 gene product 0.075 

CH.07_hsgi|5868471 0.075 

ESTs 0.075 

CH22.FGENES.507 8 0.076 

ESTs 0.076 

hemoglobin; alpha 2 0.076 

ESTs 0.076 

CH.17_hsgi|5867204 0.076 

CH22J=GENES.823_6 0.077 

EST singleton (not in UniGene) with exon hit 0.077 



N62712 

AA291468 

AA971718 
AA458708 
AI565766 



AA989542 
AA152119 



AA723748 
AA281765 

M81057 
AI859636 



AA857665 
AL137719 
AA136698 



AA578840 
A1418863 



AA635771 
AA905130 
AI571211 



S38953 



337824 
335822 
334758 

309641 AW 194230 

333064 

338695 

331809 AA402482 

326138 

328304 

330570 U6Q276 

334305 

335885 

325839 

333531 

330385 AA449749 

323305 AA811351 
331698 Z39929 



Hs.226223 

Hs.98504 

Hs.128141 
Hs.251577 
Hs.124960 



Hs.155101 ATP synthase; H+ transporting; mitochondrial F1 complex; alpha subunit; 

Isoform 1 ; cardiac muscle 0.077 
CH22_FGENES.91_2 0.077 

CH22_FGENES.799-5 0.077 

EST singleton (not In UniGene) with exon hit 0.077 

Hs.193689 ESTs 0077 

CH22_FGENES.3J2 0.078 

Hs. 180884 carboxypeptidase B1 (tissue) 0.078 

Hs.8102 ribosomal protein S20 0.078 

CH22_FGENES.759-4 0.078 

CH22_FGENES.157_8 0.078 

CH22_FGENES.440_3 0.078 

CH.X_hsgI|5868569 0.078 

CH.06_hsgi|5868142 0.079 

EST singleton (not In UniGene) with exon hit 0.079 

EST cluster (not in UniGene) with exon hit 0.079 

Hs.1 13029 ribosomal protein S25 0.079 

CH.19Jisgi|5867340 0.078 

CH22.FGENES.514_2 0.08 

CH22LFGENES.412.6 0.08 

CH22_FGENES.535_4 0.08 

Hs.77961 major histocompatibility complex; dass I; B 0.08 

EST cluster (not In UniGene) with exon hit 0.081 

CH2*_FGENES.842_6 0.081 

CH22_FGENES.583_5 0.081 

EST singleton (not in UniGene) with exon hit 0.081 

EST singleton (not in UniGene) with exon hit 0.082 

EST singleton (not in UniGene) with exon hit 0.082 

CH2£_FGENES.378J3 0.082 

CH22_FGENES571_4 0.082 
Human unidentified gene complementary to P450c21 

gene; partial cds 0.082 

CH22 - EM^C005500.GENSCAN.13-18 0.082 

CH22_FGENES.619_7 - 0.082 

CH22_FGENES.428_7 0.082 

Hs.253100 EST 0.082 

CH22_FGENES.75_7 0.083 

CH22 - EM^C005500.GENSCAN477-25 0.083 

Hs.97312 ESTs 0.083 

CH.17_hsgi|5867203 0.083 

CH.07Jisgi|6004478 0.083 

Hs.165439 arsA (bacterial) arsenite transporter; ATP-bfnding; homolog 1 0.083 

CH22.FGENES.373 8 0.083 

CH22_FGENES.632_3 0.083 

CH.16Jisgi|6552452 0.083 

CH22J=GENES.175_18 0.084 
Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens] 0.084

Hs.25307 Homo sapiens clone 24812 mRNA sequence 0.084

Hs.65843 ESTs 0.084

335888 CH22_FGENES.633_2 0.084

306008 AA894390 EST singleton (not in UniGene) with exon hit 0.084

334249 CH22_FGENES.365_15 0.084

318303 AW451197 Hs.113418 ESTs 0.084

330171 CH.02_p2 gi|6648220 0.084

336662 CH22_FGENES.4M 0.085

320506 AI815668 Hs.167476 suc1-associated neurotrophic factor target 2 (FGFR signalling adaptor) 0.085

316974 AI740721 Hs.128292 ESTs 0.085

336492 CH22_FGENES.832_9 0.085

335750 CH22_FGENES.602_1 0.085

335676 CH22_FGENES.594_1 0.086

336093 CH22_FGENES.691_1 0.086

310932 AI933861 Hs.222852 ESTs 0.086

335160 CH22_FGENES.502_4 0.088

334306 CH22_FGENES.373_1 0.086

334793 CH22_FGENES.433_5 0.086

333936 CH22_FGENES.301_12 0.087

336413 CH22_FGENES.823_35 0.087

333775 CH22_FGENES.272_1 0.087

335971 CH22_FGENES.652_4 0.087

301737 AI815981 EST cluster (not in UniGene) with exon hit 0.087

339101 CH22_DA59H18.GENSCAN.4*6 0.087

327612 CH.04_hs gi|6525283 0.087

326241 CH.17_hsgi|5867260 0.088

338386 CH22_EM:AC005500.GENSCAN.3314 0.088

327762 CH.05_hsgi|5867961 0.088

305266 AA679772 EST singleton (not in UniGene) with exon hit 0.088

334359 CH22_FGENES.378_4 0.088

335500 CH22_FGENES.571_10 0.088

329687 CH.14_p2gi|6117856 0.088

333654 CH22_FGENES.240_1 0.088

324430 AA464018 EST cluster (not in UniGene) 0.088

325999 CH.16_hs gi|5867073 0.089

334832 CH22_FGENES.439_1 0.089

339115 CH22_DA59H18.GENSCAN.49-3 0.089

300896 AI916902 Hs.213832 ESTs 0.089

328784 CH.07_hs gi|5868309 0.089

335044 CH22_FGENES.480_1 0.089

329791 CH.14_p2 gi|6469354 0.089

333656 CH22_FGENES.240J 0.089

326180 CH.17_hsgi|5887211 0.089

333391 CH22_FGENES.144_6 0.089

338324 CH22_EM:AC005500.GENSCAN^06-3 0.089

305396 AA721052 EST singleton (not in UniGene) with exon hit 0.089

337483 CH22_FGENES.795-7 0.09

326424 CH.19_hs gi|5867369 0.09

306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09

336893 CH22_DJ32I10.GENSCAN.7-6 0.09

327470 CH.GOi$ gl|5887772 0.09

333165 CH22_FGENES.91_7 0.09

307155 AI186738 Hs.182426 ribosomal protein S2 0.09

330717 AA233926 Hs.23635 ESTs 0.09

335334 CH22_FGENES.535_10 0.09

335907 CH22_FGENES.636_2 0.09

333885 CH22_FGENES.292_7 0.09

331034 N51868 Hs.31965 ESTs; Moderately similar to 40S RIBOSOMAL PROTEIN S20 [H.sapiens] 0.09

304660 AA534416 Hs.162185 ESTs 0.09

328217 CH.06_hsgi|5868096 0.091

336068 CH22_FGENES.664_13 0.091

302833 AA295381 Hs.44423 ESTs 0.091

328668 CH.07_hsgi|5868254 0.091

335309 CH22_FGENES.532_2 0.091

338481 CH22_EM:AC005500.GENSCAN.377-5 0.091

306286 AA936892 EST singleton (not in UniGene) with exon hit 0.091

305070 AA639783 EST singleton (not in UniGene) with exon hit 0.091

304870 AA594811 Hs.119122 ribosomal protein L13a 0.091

303856 AA968589 Hs.944 glucose phosphate isomerase 0.091

323789 AI459812 Hs.170460 ESTs; Weakly similar to KIAA0990 protein [H.sapiens] 0.092

334910 CH22_FGENES.455_3 0.092

326382 CH.19_hs gi|5867327 0.092

332467 AA489630 Hs.119004 KIAA0665 gene product 0.092

338534 CH22_EM:AC005500.GENSCAN.402-7 0.092

336449 CH22_FGENES.829_6 0.092

333709 CH22_FGENES.250^4 0.092

336559 CH22_FGENES.842_4 0.092

333230 CH22_FGENES.107_10 0.093

333133 CH22_FGENES.83_9 0.093

334885 CH22_FGENES.451_11 0.093

330605 X02419 Hs.77274 plasminogen activator; urokinase 0.093

338392 CH22_FGENES.823_4 0.093

334083 CH22_FGENES.327_38 0.093

325469 CH.iajis gi|6017034 0.093

331077 R09531 Hs.19039 ESTs 0.093

303701 AW500732 EST cluster (not in UniGene) with exon hit 0.093

334218 CH22_FGENES.358_3 0.093

336542 CH22_FGENES.840_1 0.093

337151 CH22_FGENES.546-1 0.093

333642 CH22_FGENES.231_1 0.093

336863 CH22_FGENES.297-4 0.093

334680 CH22_FGENES.419_1 0.093

326365 CH.18_hsgi|5867297 0.093

338952 CH22_DJ32I10.GENSCAN.23-22 0.093

337539 CH22_FGENES.832-4 0.094

333546 CH22_FGENES.180_2 0.094

335258 CH22_FGENES.5ie_3 0.094

336788 CH22_FGENES.168-19 0.094

321644 AI204177 Hs.237396 ESTs 0.094

335943 CH22_FGENES.648_17 0.094

327918 CH.06_hsgi|5868165 0.094

306398 AA970548 EST singleton (not in UniGene) with exon hit 0.094

335671 CH22_FGENES.592_3 0.094

335033 CH22_FGENES.475_11 0.094

338277 CH22_EM:AC005500.GENSCAN.290-2 0.094

332061 AA504812 Hs.192824 early B-cell factor 0.094

305153 AA654582 Hs.77039 ribosomal protein S3A 0.094

333880 CH22_FGENES.292_2 0.094

323940 AI864428 Hs.170880 ESTs 0.094

313779 AA648796 Hs.129771 ESTs 0.095

323109 AA169345 EST cluster (not in UniGene) 0.095

332930 CH22_FGENES.38_1 0.095

335368 CH22_FGENES.543_6 0.095

303887 R72672 Hs.193484 ESTs; Weakly similar to Similarity with yeast gene L3502.1 [C.elegans] 0.095

336223 CH22_FGENES.727_3 0.095

311280 AI767957 Hs.197737 ESTs; Weakly similar to Y38A8.1 gene product [C.elegans] 0.095

337256 CH22_FGENES.648-3 0.095

308814 AI819263 EST singleton (not in UniGene) with exon hit 0.095

334659 Cr^FGENES^ISJ 0.095

335895 CH22_FGENES.635_3 0.095

321697 AW388061 Hs.4953 golgi autoantigen; golgin subfamily a; 3 0.095

336010 CH22_FGENES.668_8 0.096

302824 U21260 EST cluster (not in UniGene) with exon hit 0.096

333612 CH22_FGENES.217_J 0.096

304823 AA584837 EST singleton (not in UniGene) with exon hit 0.096

335665 CH22_FGENES.590_16 0.096

306518 AA989598 EST singleton (not in UniGene) with exon hit 0.096

335243 CH22_FGENES.516_1 0.096

335436 CH22_FGENES.559_15 0.096

300243 AI420256 Hs.161271 ESTs 0.096

332810 CH22_FGENES.7_12 0.097

308612 AI735634 EST singleton (not in UniGene) with exon hit 0.097

335818 CH22_FGENES.618_6 0.097

325838 CH.16_hs gi|6552452 0.097

337482 CH22_FGENES.795-6 0.097

336645 CH22_FGENES.26-1 0.097

337293 CH22_FGENES.675-1 0.098

329893 CH.15_p2gi|6525313 0.098

326533 CH.19_hsgi|5867441 0.098

334905 CH22_FGENES.452_10 0.098

306347 AA961144 EST singleton (not in UniGene) with exon hit 0.098

336676 CH22_FGENES.43-4 0.098

339166 CH22_DA59H18.GENSCAN.69-7 0.098

335774 CH22_FGENES.607_10 0.098

339216 CH22J : F113D11.GENSCAN.6-11 0.098

335311 CH22_FGENES.532_4 0.098

329632 CH.11_p2gi|6729060 0.098

328595 CH.07_hsgi|5868224 0.098

326928 CH.21_hsgi|6456782 0.098

315234 AI079680 Hs.120770 ESTs 0.098

306082 AA908508 EST singleton (not in UniGene) with exon hit 0.093

305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098

318540 T30280 EST cluster (not in UniGene) 0.099

337553 CH22_C4G1.GENSCAN.2-1 0.099

320951 AA344069 Hs.202699 neurexophilin 4 0.099

303645 T08033 EST cluster (not in UniGene) with exon hit 0.099

338981 CH22_DA59H18.GENSCAN.2-5 0.099

321313 R87365 Hs.26058 ESTs; Weakly similar to p532 [H.sapiens] 0.099

328348 CH.07_hsgi|5866383 0.099

332203 H49388 Hs.102082 EST 0.099

301780 R07064 EST cluster (not in UniGene) with exon hit 0.099

332095 AA608838 Hs.162681 EST 0.099

333227 CH22_FGENES.107_5 0.099

316442 AA760894 Hs.153023 ESTs 0.099

326001 CH.16_hsgi|5867073 0.099

334363 CH22_FGENES.378_11 0.099

338895 CH22_DJ32I10.GENSCAN.9-2 0.099

327460 CH.02_hsgi|6004455 0.099

332705 T59161 Hs.76293 thymosin; beta 10 0.1

307806 AI351739 EST singleton (not in UniGene) with exon hit 0.1

322800 F25037 Hs.225175 ESTs 0.1

304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1

334327 CH22_FGENES.375_4 0.1

318359 AI097439 Hs.135548 ESTs 0.1

326644 CH.20_hsgi|5867559 0.1

334454 CH22_FGENES.388_3 0.1

327959 CH.06_hsgi|5868210 0.1

323783 AA330586 Hs.131819 ESTs 0.1

309198 AI955915 Hs.248038 major histocompatibility complex; class I; C 0.1

339265 CH22_BA354I12.GENSCAN.10-3 0.1

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (from clone DKFZp564C122) 0.1

338132 CH22_EM:AC005500.GENSCANi»0-2 0.1

333163 CH22_FGENES.91_5 0.101

337584 CH22_C20H12.GENSCAN.5-1 0.101

307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101

336969 CH22_FGENES.378-2 0.101

327535 CH.02_hsgi|6525279 0.101

328732 CH.07_hsgi|5868289 0.101

336686 CH22_FGENES.46-3 0.101

335777 CH22_FGENES.607_13 0.101

332944 CH22_FGENES.47_3 0.101

333174 CH22_FGENES.95_1 0.101

336380 CH22_FGENES.821_8 0.101

330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig); cytoplasmic domain; (semaphorin) 4D 0.101

331789 AA398721 Hs.186749 ESTs 0.101

338915 CH22_DJ32I10.GENSCAN.12-1 0.101

334844 CH22_FGENES.439_24 0.101

336642 CH22_FGENES.23-4 0.101

334906 CH22_FGENES.452_21 0.101

333188 CH22_FGENES.98_8 0.101

300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101

329373 CH.X_hsgi|6682537 0.102

331120 R46576 Hs.23239 ESTs 0.102

335856 CH22_FGENES.628_1 0.102

331888 AA431337 Hs.98017 ESTs 0.102

333154 CH22_FGENES.89j4 0.102

335989 CH22_FGENES.655_2 0.102

304385 AA235602 EST singleton (not in UniGene) with exon hit 0.102

338016 CH22_EM:AC005500.GENSCAN.133-1 0.102

335190 CH22_FGENES.507_5 0.102

318595 T39486 Hs.6137 ESTs 0.102

333697 CH22_FGENES.250_11 0.102

306526 AA989713 EST singleton (not in UniGene) with exon hit 0.103

328734 CH.07_hsgi|5868289 0.103

307294 AI205612 Hs.73742 ribosomal protein; large; P0 0.103

327424 CH.02_hs gi|5867751 0.103

335872 CH22_FGENES.630_3 0.103

333572 CH22_FGENES.189_1 0.103

334774 CH22_FGENES.430_6 0.103

338660 CH22_EM:AC005500.GENSCAN.462-1 0.103

326713 CH.20_hs gi|5867595 0.103

333994 CH22_FGENES.310_18 0.103

335800 CH22_FGENES.613_4 0.103

318113 AI187943 Hs.132322 ESTs 0.103

337278 CH22_FGENES.665-1 0.103

336386 CH22_FGENES.822_6 0.103

334790 CH22_FGENES.432_15 0.103

303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104

336524 CH22_FGENES.839_5 0.104

328936 CH.08_hsgi|5868500 0.104

335102 CH22_FGENES.494_7 0.104

300935 AA513644 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome protein [H.sapiens] 0.104

307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104

317301 AW291683 Hs.226056 ESTs 0.104

335330 CH22_FGENES.535_1 0.104

337968 CH22_EM:AC005500.GENSCAN.103-2 0.104

335627 CH22_FGENES.584_7 0.104

336274 CH22_FGENES.762_12 0.104

334730 CH22_FGENES.424_5 0.105

334409 CH22_FGENES.383_6 0.105

327237 CH.01_hs gi|5867544 0.105

333321 CH22_FGENES.138_13 0.105

303181 AA452366 EST cluster (not in UniGene) with exon hit 0.105

333738 CH22_FGENES.26U 0.105

338255 CH22_EM:AC005500.GENSCAN^76-3 0.105

334282 CH22_FGENES.369_12 0.105

330190 CH.05JJ2 #165182 0.105

310748 AW014249 Hs.158698 ESTs 0.105

338150 CH22_EM:AC005500.GENSCAN.207-2 0.105

336719 CH22_FGENES.82-6 0.105

330228 CH.05_p2gi|6013527 0.105

327801 CH.05_hsgi|5867024 0.105

330525 S75168 Hs.274 megakaryocyte-associated tyrosine kinase 0.105

334972 CH22_FGENES.468_2 0.105

335111 CH22_FGENES.494_19 0.106

334483 CH22_FGENES.395_5 0.108

328829 CH.07_hsgi|5868337 0.106

302753 M74299 EST cluster (not in UniGene) with exon hit 0.106

334512 CH22_FGENES.398_10 0.108

330024 CH.16_p2gi|6671908 0.106

321030 AI769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's disease candidate region 0.107

338410 CH22_EM:AC005500.GENSCAN.341-6 0.107

334353 CH22_FGENES.376_5 0.107

338278 CH22_EM:AC005500.GENSCAN.288-9 0.107

329053 CH.X_hs gi|5868574 0.107

336560 CH22_FGENES.842_5 0.107

332158 AA621363 Hs.112980 EST 0.107

336447 CH22_FGENES.829_4 0.107

333703 CH22_FGENES.250_17 0.107

326207 CH.17_hsgi|5867222 0.107

333232 CH22_FGENES.108_1 0.107

334802 CH22_FGENES.435_1 0.107

303784 AA704983 EST cluster (not in UniGene) with exon hit 0.107 

338847 CH22_OJ246D7.GENSCAN.10-2 0.107 

339407 CH22_DJ579N16.GENSCAN.1-9 0.108 

5 337635 CH22_C20H12.GENSCAN.32-8 0.108 

334650 CH2a.FGENES.417_17 0.108 

308511 AI687580 EST singleton (not in UniGene) with exon hft 0.108 

333392 CH22_FGENES.144_8 0.108 

325840 CH.16_hsgl)6552452 0.108 

10 315044 AW205664 Hs.129568 ESTs 0.108 

333298 CH22_FGENES.133_4 0.108 

335157 CH22J=GENES.501_7 0.108 

333305 CH22_FGENES.137_2 0.108 

326379 CH.19_hsgl|5867327 0.108 

15 335050 CH22J=GENES.482_1 0.108 

305185 AA663985 Hs£48038 major histocompatibility complex; class I; C 0.108 

335658 CH22 FGENES.590_9 0.108 

323040 AA336609 Hs.10862 ESTs 0.108 

337326 CH2£_FGENES.699-6 0.108 

20 339262 CH22_BA354l12.GENSCAN.9-6 0.108 

321202 H54052 Hs.163639 ESTs; Weakly similar to INTERCELLULAR ADHESION 

MOLECULE-1 PRECURSOR [H.sapiens] 0.109 

331792 AA398968 Hs£7548 EST 0.109 

333806 CH22_FGENES578_2 0.109 

25 321325 AB033100 EST cluster (not in UniGene) 0.109 

331373 AA435513 Hs.178170 ESTs; Weakly similar to DUAL SPECIFICITY 

PROTEIN PHOSPHATASE 3 037 

328775 CH.07_hs gi|5868309 0.109

335105 CH22_FGENES.494_10 0.109 

30 300975 AI283548 Hs.149668 ESTs 0.109 

324893 T31940 EST cluster (not in UniGene) 0.109 

333397 CH22.FGENES.144J5 0.109 

336484 CH22_FGENES.831_3 0.109 

335507 CH22_FGENES.571_22 0.109 

35 336373 CH22_FGENES.820_3 0.109 

336188 CH22_FGENES.717_12 0.109 

313455 AW081702 Hs.137329 ESTs 0.109 

335185 CH22_FGENES.506_4 0.109 

306814 AI066577 EST singleton (not in UniGene) with exon hit 0.109 

40 311130 AI632322 Hs.195306 ESTs 0.109

310882 AW080339 Hs.211911 ESTs 0.109

323383 AI346359 Hs.135209 ESTs 0.11

300212 AW135925 Hs.184552 biphenyl hydrolase-like (serine hydrolase; breast epithelial

mucin-assoc 0.11

45 325675 CH.14_hs gi|5867014 0.11

330095 CH.19_p2 gi|6015278 0.11

331942 AA453261 Hs.99309 ESTs 0.11 

334723 CH22_FGENES.421_34 0.11

333614 CH22_FGENES.217_9 0.11

50 337316 CH22_FGENES.692-1 0.11

305057 M635626 Hs.62954 ferritin; heavy polypeptide 1 0.11

338704 CH22_EM:AC005500.GENSCAN.480-3 0.11

335385 CH22_FGENES.543_27 0.11

338012 CH22_EM:AC005500.GENSCAN.128-10 0.11

55 329449 CH.Y_hs gi|5868886 0.11

338980 CH22_DA59H18.GENSCAN.2-4 0.11

336553 CH22_FGENES.841_10 0.111

330021 CH.16_p2 #671889 0.111 

327579 CH.03_hs gi|5867824 0.111

60 333099 CH22_FGENES.79_4 0.111 

337076 CH2a.FGENES.4534 0.111 

331388 AA456852 Hs.43543 suppressor of white apricot homolog 2 0.111 

306674 AI005542 Hs.180414 heat shock 70kD protein 10 (HSC71) 0.111 

305949 AA884409 EST singleton (not in UniGene) with exon hit 0.111 

65 330748 AA419217 Hs.15911 DKFZP586E1422 protein 0.111

333780 CH22_FGENES.273_2 0.111 

323576 AI702835 EST cluster (not in UniGene) 0.111 

308952 AI868157 Hs.224226 EST 0.111

309338 AW026946 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.111
328317 CHJUisgi|6381976 0.112

333518 CH22_FGENES.173J 0.112

306982 AI127883 EST singleton (not in UniGene) with exon hit 0.112

338225 CH22_FGENES.728J 0.112

5 333698 CH22_FGENES.250J2 0.112 

302173 AI417947 Hs.14068 ESTs 0.112 

335510 CH22_FGENES.571J5 0.112

328042 CH.06_hsgi|5902482 0.112 

338512 CH22 FGENES.834J 0.112 

10 328541 CH.07_hsgi|5868486 0.112 

311265 AW205118 Hs.199214 ESTs 0.112 

323218 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 0.112

302002 AF013956 Hs.123085 chromobox homolog 4 (Drosophila Pc class) 0.112

315088 AA557351 Hs.152448 ESTs; Moderately similar to MULTIFUNCTIONAL PROTEIN ADE2 0.112 

15 312581 AI937242 Hs.176590 ESTs 0.112 

322246 AW384710 Hs.125258 ESTs 0.112 

333659 CH22_FGENES,241J 0.113 

327510 CH.02_hs gi|6117815 0.113

336520 CH22.FGENES339J 0.113 

20 338682 CH22_EM:AC005500.GENSCAN.472-1 0.113

334508 CH22_FGENES.398_6 0.113

322533 T59538 EST cluster (not in UniGene) 0.113

306873 AI086929 EST singleton (not in UniGene) with exon hit 0.113

336040 CH22_FGENES.679_2 0.113

25 303898 T23215 EST cluster (not in UniGene) with exon hit 0.113

312011 AW294868 Hs.187226 ESTs 0.113 

335186 CH22_FGENES.506_5 0.113 

333607 CH22_FGENES.216J 0.113

305549 AA773530 EST singleton (not in UniGene) with exon hit 0.113

30 333686 CH22_FGENES.249_4 0.113 

334352 CH22.FGENES.376J 0.113 

338195 CH22_EM:AC005500.GENSCAN.233-18 0.114 

333588 CH22_FGENES.206J 0.114

339233 CH22_BA354I12.GENSCAN.5-3 0.114

35 337455 CH22_FGENES.777-1 0.114

309101 AI925108 EST singleton (not in UniGene) with exon hit 0.114

328522 CH.07_hs gi|5868477 0.114

323999 AI537333 Hs.252782 ESTs 0.114

333517 CH22J : GENES.173J> 0.114 

40 329935 CH.16_p2 gi|6165200 0.114

326226 CH.17_hs gi|5867230 0.114

335890 CH22_FGENES.633_4 0.114

336715 CH22_FGENES.77-1 0.114

327640 CH.04_hs gi|5867890 0.114

45 338842 CH22_DJ246D7.GENSCAN.7-1 0.114 

306534 AA991487 EST singleton (not in UniGene) with exon hit 0.114

336597 CH22_FGENES.266J 0.114

321010 Y17456 Hs.227150 Homo sapiens LSFR2 gene; last exon 0.114

302294 AA159213 Hs.5337 isocitrate dehydrogenase 2 (NADP+); mitochondrial 0.114

50 324895 N44238 Hs.77515 inositol 1;4;5-triphosphate receptor; type 3 0.114

327358 CH.01_hs gi|6552411 0.114

308792 AI815153 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.115

325886 CH.16_hs gi|5867087 0.115

336850 CH22_FGENES_>72-11 0.115 

55 305858 AA863103 EST singleton (not in UniGene) with exon hit 0.115

302569 AC004472 multiple UniGene matches 0.115 

336158 CH22.FGENES.707J 0.115 

327868 CH.06_hs gi|5868131 0.115

339157 CH22_DA59H18.GENSCAN.67-3 0.115

60 339258 CH22_BA354I12.GENSCAN.8-3 0.115

336129 CH22_FGENES.701J7 0.115

333684 CH22.FGENES.249JJ 0.115 

309618 AW190162 Hs.164776 ribosomal protein L23a 0.115

312926 AA954097 Hs.127523 ESTs 0.115

65 302640 AB035698 EST cluster (not in UniGene) with exon hit 0.115

328968 CH.08_hs gi|6458775 0.115

327902 CH.08_hs gi|5868158 0.115

321927 AJ223366 EST cluster (not in UniGene) 0.115

335962 CH22_FGENES.651_4 0.115
334927


CH22.FGENES.460J 


0.115 


330535 U 11872 


Human interleukin-8 receptor type B (IL8RB) mRNA,






splice variant IL8RB1 


0.858 


328591 


CH.07_hs gi|5868227


0.115 


334902 


CH22_FGENES.452_16


0.115 


326525 


CR07_hsgi|5868482 


0.115 


325870 


CH.16_hsgi|6682492 


0.116 


337522 


CH22_FGENES.819-1 


0.116 


305079 AA641329 


EST singleton (not in UniGene) with exon hit 


0.116 


327343 


CH.01_hs gi|6017017


0.116 


333918 


CH22_FGENES.296_7 


0.116 


333600 


CH22 FGENES.213J> 


0.116 


335846 


CH22_FGENES.623j8 


0.116 


333510 


CH22 FGENES.171 J 


0.116 


327629 


CH.04_hsgi|5867872 


0.116 


333470 


CH22_FGENES.161_6


0.116 


326855 


CH.20_hs gi|6552460


0.118 


327008 


CH.21_hs gi|5867664


0.117 


337480 


CH22.FGENES.795-3 


0.117 


338425 


CH22_FGENES.824_10


0.117 


321964 AL079587 


Hs.171065 ESTs 


0.117 


335651 


CH22_FGENES.590_2 


0.117 


308164 AI521574


Hs.181165 eukaryotic translation elongation factor 1 alpha 1


0.117 


337927 


CH22_EM:AC005500.GENSCAN.80-3 


0.117 


300341 H45095 


Hs.153524 ESTs 


0.117 


300154 AI245127 


Hs.179331 ESTs 


0.117 


308295 AA937331 


EST singleton (not in UniGene) with exon hit 


0.117 


329670 


CH.14_p2gi|6272129 


0.117 


335612 


CH22_FGENES.583_6 


0.117 


307845 AI363450


EST singleton (not in UniGene) with exon hit 


0.117 


330401 D28383 


Human mRNA for ATP synthase B chain, 5'UTR (sequence from the 




5'cap to the start codon) 


0.117 


327127 


CH.21_hsgi|6682520 


0.117 


333843 


CH22 FGENES.290J 


0.117 


331083 R17762 


Hs.22292 ESTs 


0.117 


329140 


CH.X_hs gi|6017060


0.117 


339338 


CH22_BA354U2.GENSCAN.27-3 


0.117 


331974 AA464518 


Hs.99616 ESTs 


0.117 


338631 


CH22_EM:AC005500.GENSCAN.454-2 


0.117 


330299 


CH.06_p2 gi|2905881


0.117 


330351 


CH.09_p2gi|3056622 


0.117 


305377 AA7157U 



Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA)


0.117 


333106 


CH22_FGENES.79_12


0.117 


3385U 


CK22 W EM-^C005500.GENSCAN^924 


0.117 


327335 


CH.01_hsgi|5902477 


0.117 


301970 AB028962 


Hs.120245 KIAA1039 protein 


0.118 


326339 


CH.17_hsgi|6056311 


0.118 


330612 X15673 


Hs.93174 Human endogenous retrovirus pHE.1 (ERV9) 


0.118 


334178 


CH22_FGENES.350_6


0.118 


328008 


CH.06_hs gi|5902482


0.118 


329976


CH.16_p2 gi|4878063


0.118 


320952 AA897432 


Hs.130411 ESTs 


0.118 


3Q5B21 AA7S9095 


EST singleton (not in UniGene) with exon hit 


0.118 


337850 


CH22_EM:AC005500.GENSCAN.34-3


0.118 


333626 


CH22_FGENES.224_2 


0.118 


337672 


CH22_EM:AC000097.GENSCAN.67-1


0.118 


328803 


CH.07_hs gi|6004475


0.118 


325922 


CH.18_hs gi|5887122


0.118 


334489 


CH22_FGENES.397_1


0.118 


320638 R54766 


Hs.101120 ESTs 


0.118 


321932 AA569229 


EST cluster (not in UniGene) 


0.118 


336958 


CH22_FGENES.387-1 


0.118 


332082 AA600176 


Hs.112345 ESTs


0.118 


306004 AA889992 


EST singleton (not in UniGene) with exon hit 


0.118 


336803 


CH22_FGENES.194-1 


0.118 


309107 AI925823 


EST singleton (not in UniGene) with exon hit 


0.118 


336859 


CH22_.FGENES.293* 


0.116 


337935


CH22_EM:AC005500.GENSCAN.85-6 


0.118 


326492 


CH.19_hsgi|5867422 


0.118 
327289


CH.01_hs gi|5867481


0.119 


325818 


CH.14_hsgi|6682490 


0.119 


310787 AW262580 


Hs.159040 ESTs 


0.119 


330028 


CH.16_p2 gi|6671908


0.119 


325317 


CH.11_hsgi|5866878 


0.119 


335279 


CH22_FGENES.523_7 


0.119 


331720 AA192173 


Hs.221530 ESTs 


0.119 


329186 


CH.X_hs gi|5868711


0.119 


316012 AA764950 


Hs.119898 ESTs 


0.119 


338316 


CH22_EM:AC005500.GENSCAN.304-2


0.119 


326033 


CH.17_hsgi|5887178 


0.119 


334745 


CH22_FGENES.426_3 


0.119 


333051 


CH22_FGENES.73_5


0.119 


301763 R01279 


EST cluster (not in UniGene) with exon hit


0.12 


304502 AA454609 


Hs.172928 collagen; type I; alpha 1


0.12 


335680 


CH22.FGENES.594J5 


0.12 


304678 AA548558 


EST singleton (not in UniGene) with exon hit


0.12 


335441 


CH22_FGENES.560_4 


0.12 


336187 


CH22_FGENES.717_11 


0.12 


309422 AW087175 


EST singleton (not in UniGene) with exon hit


0.12 


336047 


CH22_FGENES.679_9 


0.12 


309651 AW195850 


EST singleton (not in UniGene) with exon hit 


0.12 


308547 AI695385


Hs.201903 EST 


0.12 


304443 AA399444 


EST singleton (not in UniGene) with exon hit 


0.12 


336245 


CH22_FGENES.746_3 


0.12 


302703 H72333 


EST cluster (not in UniGene) with exon hit


0.12 


335690 


CH22_FGENES.596_5 


0.12 


328941 


CH.08_hsgi|6456765 


0.12 


333873 


CH22.FGENES29L9 


0.12 


317246 AW105092


Hs.155690 ESTs 


0.12 


339288 


CH22_BA354I12.GENSCAN.16-6


0.12 


337996 


CH22_EM:AC005500.GENSCAN.116-3


0.12 


333304 


CH22J=GENES.137J 


0.121 


308332 AI591235 


EST singleton (not in UniGene) with exon hit


0.121 


329319 


CH.X_hsgi|6381976 


0.121 


302086 X57138 


multiple UniGene matches 


0.121 


333290 


CH22_FGENES.129_2 


0.121 


323825 AI793080 


Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED




LIPOCALIN PRECURSOR [R.norvegicus]


0.121 


330575 U64105 


Hs.252280 Rho guanine nucleotide exchange factor (GEF) 1 


0.121 


305274 AA679990 


Hs.181165 eukaryotic translation elongation factor 1 alpha 1


0.121 


333647 


CH22_FGENES.235_2 


0.121 


302251 AA333340 


EST cluster (not in UniGene) with exon hit


0.121 


329777 


CH.14_p2gf|60G2090 


0.121 


333155 


CH22_FGENES.89_5 


0.121 


326122 


CH.17_hs gi|5867194


0.121 


335310 


CH22_FGENES.532_3 


0.121 


335453 


CH22.FGENES.562J3 


0.122 


305103 AA643329 


Hs.111334 ferritin; light polypeptide 


0.122 


337284 


CH22.FGENES.667-2 


0.122 


337418 


CH22.FGENES.758-4 


0.122 


313073 AI963740 


Hs.46826 ESTs 


0.122 


303759 AW504164 


EST cluster (not in UniGene) with exon hit 


0.122 


300017 M33197


AFFX control: GAPDH


0.122


316725 AW135084 


Hs.127264 ESTs 


0.122 


330738 AA293153 


Hs.120980 nuclear receptor co-repressor 2


0.122 


336466 


CH22_FGENES.829_25 


0.122 


335956 


CH22_FGENES.647_3


0.122 


315308 AA780564 


Hs.189053 ESTs


0.122 


338925 


CH22_DJ32I10.GENSCAN.14-3


0.122 


334969 


CH22_FGENES.466_2 


0.122 


322050 AL137589 


EST cluster (not in UniGene)


0.122 


339084 


CH22_DA59H18.GENSCAN.58-2


0.122 


338323 


CH22_EM:AC005500.GENSCAN.308-2 


0.122 


337003 


CH22_FGENES.419-7 


0.122 


325470 


CH.12_hs gi|6017034


0.123 


336503 


CH22.FGENES.833J0 


0.123 


330786 D60374


Hs£58712 EST 


0.123 



329446 


CH.Y_hs gi|5868886


0.123 


303328 AA229433 


Hs.222634 ESTs; Moderately similar to ubiquitin-like protein /






ribosomal protein S30 


0.123 


309067 AI916313 


Hs.212788 EST 


0.123 


317464 AA968472 


Hs.130463 ESTs 


0.123 


328755 


CH.07_hs gi|5668301


0.123 


326036 


CH.17_hs gi|5867178


0.123 


327208 


CH.01_hs gi|5867447


0.123 


326124 


CH.17_hsgi|5916395 


0.123 


327509 


CH.02_hs gi|6117815


0.123 


338398 


CH22_EM:AC005500.GENSCAN.336-5 


0.123


304652 AA527782 


Hs.84298 CD74 antigen (invariant polypeptide of major 






histocompatibility complex; class II antigen-associated) 


0.123 


335797 


CH22_FGENES.612_3 


0.124 


336714 


CH22_FGENES.76-29


0.124 


327204 


CH.01_hsgi|5867447 


0.124 


331881 AA430872 


Hs.123778 ESTs 


0.124


306971 AI126509 


EST singleton (not in UniGene) with exon hit 


0.124 


336174 


CH22 FGENES.710J 


0.124 


336126 


CH22_FGENES.701_13


0.124 


329129 


CH.X_hs gi|6588026


0.124 


303049 AW407562 


EST cluster (not in UniGene) with exon hit


0.124 


335778 


CH22_FGENES.607J4


0.124


336601 


CH22_FGENES.369_2


0.124 


334340 


CH22_FGENES.375J7 


0.124 


337436 


CH22_FGENES.767-1 


0.124 


306013 AA896990 


EST singleton (not in UniGene) with exon hit 


0.124 


339213 


CH22_FF113D11.GENSCAN.6-8


0.124 


335355 


CH22_FGENES.541_2 


0.124 


336552 


CH22_FGENES.841_9


0.124 


336384 


CH22_FGENES.822_4 


0.124 


310485 AI286202 


Hs.149800 ESTs


0.125 


335840 


CH22_FGENES.622_3


0.125 


336444 


CH22_FGENES.827J 0 


0.125 


315703 N36070 


EST cluster (not in UniGene) 


0.125 


327763 


CH.05_hs gi|5867961


0.125 


336383 


CH22_FGENES.822_3


0.125 


333496 


CH22_FGENES.168_6



0.125 


328662 


CH.07_hs gi|6004473 


0.125 


338988 


CH22_DA59H18.GENSCAN.5-1


0.125 


328311 


CH.07_hs gi|5868371


0.125 


337241 


CH22_FGENES.644-2


0.125 


336933 


CH22_FGENES.350-7 


0.125 


313483 AW294432 


Hs.144252 ESTs


0.125 


326116 


CH.17_hs gi|5667193


0.125 


330450 HG363-HT363 


Epidermal Growth Factor Receptor-Related Protein


0.125


307491 AI268539


EST singleton (not in UniGene) with exon hit


0.125 


331852 AA418988 


Hs.68314 Homo sapiens mRNA; cDNA DKFZp586L0120






(from clone DKFZp586L0120)


0.125 


330462 HG844-HT844


flnnflmlna Rflmntor 04 


0.125 




EST singleton (not in UniGene) with exon hit


0.125


336385 


CH22_FGENES.822_5


0.125


336793 


CH22_FGENES.176-3


0.125 


326243 


CH.17_hs gi|5667261


0.125 


327266 


CH.01_hs gi|5867462


0.125 


320753 AF070579 


Hs.181544 Homo sapiens clone 24487 mRNA sequence


0.125 


336960 


CH22_FGENES.369-5


0.125 


329667 


CH.14_p2 gi|6272129


0.125 


328163 


CH.06_hs gi|5868071


0.125 


336534 


CH22.FGENES.839J6 


0.125 


339289 


CH22_BA354I12.GENSCAN.16-9


0.128 


309230 AI970747 


EST singleton (not in UniGene) with exon hit 


0.126 


339190 


CH22_FF113D11.GENSCAN.1-2


0.126 


337086 


CH22_FGENES.458-14


0.126 


319233 R21054 


Hs^ 11522 ESTs 


0.126


339393 


CH22_BA232E17.GENSCAN.6-8 


0.126 


331930 AA449077 


Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 






(from clone DKFZp586H192


0.128 


308099 AI475914 


EST singleton (not in UniGene) with exon hit 


0.126 
338477 CH22_EM:AC005500.GENSCAN.373-5 0.126

334286 CH22_FGENES.369_16 0.126

317245 AI025039 Hs.131732 ESTs 0.126

335249 CH22_FGENES.516_10 0.126

5 333327 CH22_FGENES.138J0 0.126

304240 AA009802 EST singleton (not in UniGene) with exon hit 0.128

335464 CH22_FGENES.562J6 0.126

335236 CH22.FGENES.515J 0.126 

334154 CH22_FGENES.340_4 0.126 

10 309257 AI984183 EST singleton (not in UniGene) with exon hit 0.126

310015 AI220122 Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen

[H.sapiens] 0.126

328280 CH.07_hs gi|5868352 0.126

305744 AA831819 EST singleton (not in UniGene) with exon hit 0.126

15 327430 CH.02_hs gi|5867754 0.126

328323 CH.07_hs gi|5868373 0.126

333274 CH22_FGENES.123_2 0.128 

337193 CH22.FGENES.575-3 0.127 

334820 CH22_FGENES.437_2 0.127

20 328706 CH.07_hsgi|5868270 0.127 

331228 W67267 Hs.174911 ESTs 0.127 

307205 AI192479 EST singleton (not in UniGene) with exon hit 0.127

337123 CH22_FGENES.519-3 0.127

326201 CH.17_hsgi|5867216 0.127 

25 335276 CH22_FGENES.523_2 0.127 



331202 T81115 Hs.191136 ESTs 0.127 

330532 U03187 Hs.121544 interleukin 12 receptor; beta 1 0.127

321235 N49521 EST cluster (not in UniGene) 0.127

301743 F12605 Hs.204529 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 0.127 

30 328175 CH.06_hsgi|5868073 0.127 

306407 AA971985 EST singleton (not in UniGene) with exon hit 0.127 

327145 CH.01_hs gi|5867548 0.127

327649 CH.04_hsgi|5867899 0.127 

335142 CH22.FGENES.498J2 0.127 

35 333909 CH22_FGENES.295_2 0.127 

330608 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32;

Charcot-Marie-Tooth neuropathy; X-linked) 0.127

330158 CH^1jp2gi|6580367 0.127 

320153 AF064594 Hs.120360 phospholipase A2; group VI 0.127

40 314407 AA098835 Hs.224432 ESTs 0.127 

333383 CH22JFGENES.143J22 0.127 

320663 AI734242 Hs.244473 ESTs 0.128 

326233 CH.17_hs gi|5867232 0.128

326598 CH.20_hs gi|5867634 0.128

45 335174 CH22_FGENES.504_4 0.128 

319843 H29920 Hs.99486 ESTs; Weakly similar to aralarl [H.sapiens] 0.128 

335458 CH22.FGENES.562J8 0.128 

332997 CH22.FGENES.58_4 0.128 

334188 CH22.FGENES.352J3 0.128 

50 329759 CH.14_p2 gi|6048280 0.128 

330348 CH.09_p2 gi|4544475 0.128

326958 CH.21J1S #469838 0.128 

305263 AA679467 EST singleton (not in UniGene) with exon hit 0.128

337693 CH22_EM:AC000097.GENSCAN.78-14 0.128

55 326812 CH.20_hs gi|6682504 0.128

333237 CH22_FGENES.108_7 0.128

333699 CH22_FGENES.250J3 0.128

311498 AI768677 Hs.209888 ESTs; Weakly similar to phosphatidylserine

synthase-2 [M.musculus] 0.128

60 336499 CH22_FGENES.833_4 0.128

320087 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 0.128

309989 AI184188 Hs.197813 ESTs 0.128 

301490 AW298468 Hs.250461 ESTs 0.128 

337011 CH22_FGENES.427-8 0.128

65 315052 AA876910 Hs.134427 ESTs 0.128 

301611 W22172 Hs£9038 ESTs 0.128 

336497 CH22_FGENES.833_2 0.129

302068 Y16280 Hs.132049 endothelin type b receptor-like protein 2 0.129

334502 CH22_FGENES.397_18 0.129
304332 AA158884 EST singleton (not in UniGene) with exon hit 0.129

304522 AA465405 EST singleton (not in UniGene) with exon hit 0.129 

312407 R46180 Hs.153485 ESTs 0.129 

310098 AI685841 Hs.161354 ESTs 0.129 

5 301119 AF142579 EST cluster (not in UniGene) with exon hit 0.129 

309268 AI985821 Hs.62954 ferritin; heavy polypeptide 1 0.129 

330989 H42142 Hs£26396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19

(Dbp5; yeast homolog) 0.129 

336949 CH22_FGENES.361-4 0.129 

10 330115 CH.19_p2 gi|6015202 0.129

339212 CH22_FF113D11.GENSCAN.8-7 0.129

326951 CH.21_hs gi|6004446 0.129

305165 AA662939 EST singleton (not in UniGene) with exon hit 0.129

308238 AI559492 EST singleton (not in UniGene) with exon hit 0.129 

15 337140 CH^FGENES^-S 0.13 

321758 U29112 EST cluster (not in UniGene) 0.13 

304619 AA515554 Hs.119598 ribosomal protein L3 0.13

312469 AA745289 Hs.173088 ESTs 0.13 

339017 CH22_DA59H18.GENSCAN.20-6 0.13 

20 330116 CH.19_p2gi|6015202 0.13 

333312 CH22_FGENES.138_4 0.13

338004 CH22_EM:AC005500.GENSCAN.12M 0.13

314141 AA232134 Hs.190028 ESTs 0.13

300509 AI239845 Hs.128494 ESTs; Weakly similar to EG55B72 [D.melanogaster] 0.13

25 338530 CH22_EM:AC005500.GENSCAN.398-11 0.13

335968 CH22_FGENES.652_1 0.13

314121 AI732100 Hs.187619 ESTs 0.13 

337593 CH22_C20H12.GENSCAN.6-8 0.13 

332881 CH22.FGENES.33J 0.13 

30 305836 AA858043 EST singleton (not in UniGene) with exon hit 0.13

339059 CH22_DA59H18.GENSCAN.30* 0.13 

305610 AA782319 EST singleton (not in UniGene) with exon hit 0.13

305852 AA862455 EST singleton (not in UniGene) with exon hit 0.13

327409 CHMJns gl|5867750 0.13 

35 312751 AI613089 Hs.164178 ESTs 0.13 

308726 AI799268 Hs.209929 EST 0.13

325961 CH.16_hs gi|5867147 0.13 

311159 AW025919 Hs.197636 ESTs 0.13 

322715 AA057230 Hs.182135 ESTs 0.13 

40 336441 CH22_FGENES.827_7 0.13 

336339 CH22J=GENES.814J2 0.13 

306911 AI095365 EST singleton (not in UniGene) with exon hit 0.13 

333613 CH22_FGENES.217_8 0.13

338489 CH22_EM:AC005500.GENSCAN.384-17 0.131

45 326904 CH.21_hs gi|5867684 0.131

337337 CH22_FGENES.717-1 0.131

326752 CH.20_hs gi|5867615 0.131

303977 AW512978 EST singleton (not in UniGene) with exon hit 0.131 

301373 AA595235 EST cluster (not in UniGene) with exon hit 0.131 

50 338448 CH22_EM:AC005500.GENSCAN.359-22 0.131

333774 CH22_FGENES.272_5 0.131

332886 CH22_FGENES.54_8 0.131

335362 CH22_FGENES.541J2 0.131

335896 CH22_FGENES.635_4 0.131

55 337825 CH22_EM:AC005500.GENSCAN.13-19 0.131

325257 CH.11_hs gi|5866895 0.131

331188 T50240 Hs.167837 ESTs 0.131

330645 Y08302 Hs.144879 dual specificity phosphatase 9 0.131

331760 AA292721 Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131 

60 322995 AA513829 Hs.29797 ribosomal protein L10 0.131 

335497 CH22_FGENES.571_5 0.131 

334824 CH22_FGENES.437_6 0.131

319480 R06933 Hs.184221 ESTs 0.131

334842 CH22_FGENES.439_21 0.131

65 333335 CH22 FGENES.139J 0.131 

317252 AA905178 Hs.130124 ESTs 0.131 

329034 CH.X_hs gi|5868561 0.131

305188 AA664230 EST singleton (not in UniGene) with exon hit 0.131 

335755 CH22_FGENES.604_4 0.131 
302143


H15270 


Hs.189847 putative neuronal cell adhesion molecule 


0.131 


334939 




CH22_FGENES.465_3


0.131 


318394 


C15110 


Hs.17802 ESTs 


0.131 


334498 




CH22_FGENES.397_14


0.131 


333413 




CH22.FGENES.146J 


0.132 


329876 




CH.14_p2 gi|6272128


0.132 


327277 




CH.01_hsgi|5867473 


0.132 


305022 


AA627416 


EST singleton (not in UniGene) with exon hit 


0.132 


336805 




CH22J=GENES.196* 


0.132 


320121 


T93657 


EST cluster (not in UniGene) 


0.132 


334761 




CH22_FGENES.428_10


0.132 


339400 




CH22_BA232E17.GENSCAN.7-6 


0.132 


330301 




CH.06_p2gi|2905862 


0.132 


316822 


AA827691 


Hs.129967 ESTs; Weakly similar to neuronal thread protein








AD7c-NTP[H.sapiens] 


0.132 


328020 




CH.06_hs gi|5902482


0.132 


325327 




CH.11_hs gi|5866875


0.132 


321163 


AA209530 


EST cluster (not in UniGene) 


0.132 


336393 




CH22_FGENES.823_5 


0.132 


325905 




CH.16_hsgi|5867104 


0.132 


305237 


AA676286 


Hs.2186 eukaryotic translation elongation factor 1 gamma


0.132 


339046 




CH22_DA59H18.GENSCAN.28-6 


0.132 


325375 




CH.12_hs gi|5866920


0.132


333961 




CH22_FGENES.304_7


0.132 


335450 




CH22_FGENES.562_8 


0.133 


302266 


R58438 


EST cluster (not in UniGene) with exon hit 


0.133 


335116 




CH22_FGENES.496_3


0.133 


327333 




CH.01_hsgi|5902477 


0.133 


308070 


AI470948 


EST singleton (not in UniGene) with exon hit 


0.133 


308311 


AI581855 


EST singleton (not in UniGene) with exon hit 


0.133 


320813 


AW3S0847 


Hs^08839 ESTs 


0.133 


323665 


AW248307 


EST cluster (not in UniGene) 


0.133 


328318 




CH.07_hs gi|5868373


0.133 


320603 


R51419 


EST cluster (not in UniGene)


0.133 


332791 




CH22.FGENES.3J 


0.133 


314976 


AA524725 


Hs.162108 ESTs 


0.133 


303309 


AL134164 


Hs.224868 ESTs 


0.133 


320581 


R39753 


Hs.170187 ESTs 


0.133 


333944 




CH22_FGENES.302_2


0.133 


317992 


AI733512 


Hs.130901 ESTs


0.133 


330935 


F02383 


Hs.26492 beta-1;3-glucuronyltransferase 3 (glucuronosyltransferase I)


0.133 


336659 




CH22_FGENES.36-5 


0.133 


338887 




CH22_DJ32I10.GENSCAN.6-10


0.133 


305273 


AA679979 


Hs.181165 eukaryotic translation elongation factor 1 alpha 1


0.133 


333566 




CH22_FGENES.183_2 


0.134 


316952 


AW450033 


Hs.163312 ESTs 


0.134 


333818 




CH22._FGENES.283J 


0.134 


328687 




CH.07_hs gi|5868262


0.134 


302879 


H11802


EST cluster (not in UniGene) with exon hit 


0.134 


336557 




CH22_FGENES.842_2


0.134 


335222 




CHa^jFGENES^IS.S 


0.134 


338094 




CH22_EM:AC005500.GENSCAN.179-3


0.134


337384 




CH22_FGENES.745-1 


0.134 


327360 




CH.01_hs gi|6552411


0.134 


328132 




CH.06_hsgi|5868038 


0.134 


323604 


AI751438 


Hs.182827 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ








WARNING ENTRY !!!!


0.134 


337591 




CH22_C20H12.GENSCAN.6-6 


0.134 


307018 


AI140639 


EST singleton (not in UniGene) with exon hit 


0.134 


326896 




CH.21_hs gi|5867680


0.134 


333479 




CH22_FGENES.163_5


0.134 


337915 




CH22_EM:AC005500.GENSCAN.61-3


0.134 


335110 




CH22 FGENES.494J8 


0.134 


333481 




CH22_FGENES.163_9


0.134 


327512 




CH.02_hs gi|6117815


0.134 


300098 


AW328639 


Hs.83575 ESTs; Weakly similar to ZC328 J [C.elegans]


0.134 


330163 




CH.02_p2gi|6042042 


0.135 


335752 




CH22_FGENES.604J 


0.135 


334857 




CH22J=GENES.443_1 


0.135 
301872 HS4730 EST cluster (not in UniGene) with exon hit 0.135

337529 CH22_FGENES.823.29 0.135 

335734 CH22_FGENES.601_4 0.135 

337551 CH22_FGENES.647-8 0.135 

5 309078 AI920965 Hs.77961 major histocompatibility complex; class I; B 0.135 

335513 CH22_FGENES.571J8 0.135 

339078 CH22_DA59H18.GENSCAN.37-6 0.135

321907 N56660 Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.sapiens] 0.135

337189 CH22_FGENES.571-32 0.135 

10 329635 CH.12_p2gi|5302817 0.135 

308601 AI719930 EST singleton (not in UniGene) with exon hit 0.135

305020 AA627248 Hs.2064 vimentin 0.135 

333894 CH22_FGENES.293J 0.135 

322465 AA137152 Hs.3784 ESTs; Highly similar to phosphoserine aminotransferase 

15 [H.sapiens] 0.135

305601 AA780975 EST singleton (not in UniGene) with exon hit 0.135

332186 H10781 Hs.141051 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB

WARNING ENTRY 0.135

327822 CH.05_hs gi|5867968 0.135

20 310087 AI393914 Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain

binding protein 0.135

328752 CH.07_hs gi|5868298 0.135

337611 CH22_C20H12.GENSCAN.19-4 0.135 

334470 CH22.FGENES.394J 0.136 

25 335115 CH22_FGENES.496_2 0.136

328730 CH.07_hs gi|5868289 0.136 

330350 CH.09_p2 gi|3056622 0.138 

336971 CH22_FGENES.378.6 0.136 

308258 AI565612 EST singleton (not in UniGene) with exon hit 0.136 

30 326745 CH.20_hsgi|5887611 0.136 

335440 CH22_FGENES.560_3 0.136 

320257 AA330746 EST cluster (not in UniGene) 0.136

328677 CH.07_hs gi|5868256 0.136

329731 CH.14_p2 gi|6065783 0.136

35 315950 AA700553 Hs.206974 ESTs 0.136

330049 CH.17_p2 gi|4567182 0.138

337070 CH22_FGENES.448-3 0.136 

304095 H11324 Hs.31059 EST 0.136 

309304 AW005527 Hs.232820 EST 0.136

40 333458 CH22_FGENES.157_7 0.136

329899 CH.15_p2 gi|6563505 0.136

322202 AI275056 Hs.200133 ESTs 0.136 

333991 CH22.FGENES.310J5 0.136 

318617 AW247252 Hs.75514 nucleoside phosphorylase 0.136

45 310623 AI341586 Hs.195588 ESTs 0.138

330489 M23323 Hs.3003 CD3E antigen; epsilon polypeptide (TiT3 complex) 0.136

309646 AW194694 EST singleton (not in UniGene) with exon hit 0.136

331068 R00071 Hs.191199 ESTs 0.136

334285 CH22_FGENES.369_15 0.136

50 332178 F13689 Hs.100725 EST 0.138

305724 AA827608 EST singleton (not in UniGene) with exon hit 0.136

303158 AL138110 Hs.8594 Homo sapiens mRNA containing (CAG)4 repeat; clone CZ-CAG-7 0.136 

334543 CH22_FGENES.403_8 0.136

335384 CH22_FGENES.543_26 0.136

55 336527 CH22_FGENES.839_8 0.136

334951 CH22_FGENES.465_20 0.136

325882 CH.16_hs gi|5867087 0.137

305134 AA653159 EST singleton (not in UniGene) with exon hit 0.137

307058 AI148709 EST singleton (not in UniGene) with exon hit 0.137

60 331943 AA453418 Hs.178272 ESTs 0.137

331116 R44780 Hs.22634 ESTs 0.137

306094 AA908877 EST singleton (not in UniGene) with exon hit 0.137

333561 CH22_FGENES.180J8 0.137

321439 H61962 EST cluster (not in UniGene) 0.137 

65 324594 AA497090 EST cluster (not in UniGene) 0.137 

337926 CH22_EM:AC005500.GENSCAN.77-4 0.137

337353 CH22_FGENES.726-1 0.137

331836 AA412295 Hs.104774 EST 0.137 

308981 AI873242 EST singleton (not in UniGene) with exon hit 0.137 
329424 CH.Y_hs gi|5868879 0.137

325829 CH.15_hs gi|5867052 0.137

331845 AA416883 Hs.98183 ESTs 0.137 

333854 CH22.FGENES.290J3 0.137 

5 306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137 

328948 CH.08_hs gi|6456765 0.137 

338935 CH22_DJ32I10.GENSCAN.18-12 0.137

325980 CH.16_hs gi|5887147 0.137

328377 CH.07_hs gi|5868390 0.138

10 308851 AI629820 EST singleton (not in UniGene) with exon hit 0.138

314620 AA424352 Hs.210586 ESTs 0.138

337592 CH22_C20H12.GENSCAN.6-7 0.138

338684 CH22_EM:AC005500.GENSCAN.472-3 0.138

331800 AA400498 Hs.97543 ESTs 0.138

15 304587 AA505535 EST singleton (not in UniGene) with exon hit 0.138

333981 CH22_FGENES.310_4 0.138 

332452 AA040369 Hs.11170 SYT interacting protein 0.138 

305752 AA835278 EST singleton (not in UniGene) with exon hit 0.138 

311947 T65554 HsJ>51591 EST 0.138 

20 333783 CH22_FGENES.273_5 0.138 

337406 CH22_FGENES.754-14 0.138

327976 CH.06_hs gi|5868212 0.138

325593 CH.13_hs gi|5866992 0.138

339425 CH22_DJ579N16.GENSCAN.144 0.138

25 304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138

309488 AW131104 EST singleton (not in UniGene) with exon hit 0.138

337532 CH22_FGENES.827-8 0.138

317234 AA904448 Hs.120368 ESTs 0.138

312261 AA854425 Hs.144455 ESTs 0.138

30 328927 CH.08_hs gi|5868500 0.138

336424 CH22_FGENES.824_9 0.138

326667 CH.20_hs gi|6552455 0.138

325988 CH.16_hsgi|5867064 0.138 

318446 AW300287 EST cluster (not in UniGene) 0.139

35 336511 CH22_FGENES.834_6 0.139

335204 CH22_FGENES.508J3 0.139

303244 AA147472 EST cluster (not in UniGene) with exon hit 0.139

330870 AA115804 Hs.187593 ESTs 0.139

329376 CH.X_hs gi|5868859 0.139

40 304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139

333653 CH22_FGENES.239_2 0.139

306799 AI051698 EST singleton (not in UniGene) with exon hit 0.139

304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139 

330812 AA013001 Hs.60563 ESTs 0.139 

45 329568 CH.10_p2 gi|3962490 0.139

319210 AA253074 Hs.146261 ESTs 0.139 

334320 CH22_FGENES.374_5 0.139 

300860 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139

305666 AA864533 EST singleton (not in UniGene) with exon hit 0.139 

50 312943 AA984364 Hs.119064 ESTs 0.139 

330523 M99439 Hs.83958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139

312708 AI076204 Hs.135440 ESTs 0.139

309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139

303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139 

55 317484 AW274696 Hs.143921 ESTs 0.139 

333239 CH22_FGENES.111_1 0.139 

307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139

316813 AA826505 Hs.124517 ESTs 0.139

331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0386 [H.sapiens] 0.139

60 308558 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139

310784 AW086142 Hs.159017 ESTs 0.139

323831 AA335715 Hs.200299 ESTs 0.139

307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139

310570 AI318327 EST cluster (not in UniGene) 0.139

65 327934 CH.06_hs gi|5868184 0.139

305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139

334756 CH22_FGENES.428_5 0.139

331938 AA451867 Hs.99255 ESTs 0.139 

301393 AI474722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139 
312005 T78450 Hs.13941 ESTs 0.139

338431 CH22_EM:ACOQ5500.GENSCAN.35M 0.14 

331214 T90496 Hs.16757 ESTs 0.14 

333601 CH22.FGENES.213J 0.14 

5 323481 AA278449 Hs.137429 ESTs 0.14 

336911 CH22.FGENES.3444 0.14 

338157 CH22_EM:AC005500.GENSCAN.209-5 0,14 

327845 CH.05JUS gi|6531962 0.14 

319109 Z45662 Hs.90797 Homo sapiens done 23620 mRNA sequence 0.14 

10 334763 CH22.FGENES.428J2 0.14 

329384 CHJLhsgi|5868869 0.14 

302996 AF054663 EST cluster (not In UnlGene) with exon hit 0.14 

323751 AW452656 Hs.209824 ESTs 0.14 

329916 CH,16_p2gi|6223624 0.14 

15 301993 N49826 Hs.18602 ESTs 0.14 

338129 CH2^EMAC005500.QENSCAN.197-2 0.14 

325704 CH.14J1S gl|5867028 0.14 

335656 CH22.FGENES.590J 0.14 

331673 W72366 Hs.40033 ESTs 0.14 

20 316807 AI018331 Hs.1 72444 ESTs; Highly similar to transcription regulator [M.musculus] 0.14 

310743 AW449754 Ha.158665 ESTs 0.14 

326941 CH.21_hsgi|6004446 0.14 

328809 CH.07 Jis gi|5868327 0.14 

323855 AI653164 Hs.128665 ESTs 0.14 

25 304705 AA564064 EST singleton (not to UnlGene) with exon hit 0.14 

325666 CH.14_hs gl|6469822 0.14 

333747 CH22_FGENES.265_6 0.14 

318287 AW015616 Hs.143321 ESTs 0.141 

332972 CH22.FGENES.51_5 0.141 

30 305704 AA825266 EST singleton (not in UnlGene) with exon htt 0.141 

315699 AW182805 Hs.189183 ESTs; Weakly similar to Nodi [H.sapiens] 0.141 

327296 CH.OIJis gi[5867492 0.141 

336400 CH22j r GENES.823_15 0.141 

321033 H26214 Hs.20733 ESTs; Weakly similar to HI! ALU SUBFAMILY SX 

35 WARNING ENTRY 0.141 

316522 AI475995 Hs.122910 ESTs 0.141 

335715 CH22J=Ge€S.599j5 0.141 

335959 CH22_FGENES.650J 0.141 

333259 CH22_FGENES.118_7 0.141 

40 337382 CH22J : GBiES.744-8 0.141 

322346 AA227618 Hs.10882 HMG-box containing protein 1 0.141 

325378 CH.12_hsgi|5866920 0.141 

338500 CH22_EMAC005500.GENSCAN^90-1 0.141 

338460 CH22„EM^C005500.GENSCAN562^ 0.141 

45 315279 AW511138 Hs.256581 ESTs 0.141 

314439 AI539443 Hs.137447 ESTs 0.141 

333624 CH22_FGENES.222_3 0.141 

329237 CHX_hs gl|5868729 0.141 

330117 CH.19_p2gi|6015201 0.141 

50 338017 CH22.EMAC005500.GENSCAN.134-1 0.141 

337854 CH22_EM:AC005500.GENSCAN.38-12 0.142 

329984 CH.16_p2 gi|4646193 0.142 

305004 AA622328 Hs.1 62762 EST - 0.142 

302815 N40373 EST cluster (not In UnlGene) with exon hit 0.142 

55 327823 CH.05Jisgi|5887968 0.142 

326753 CH^OJs gi[5867616 0.142 

301201 AA904482 Hs.197775 ESTs 0.142 

334303 CH22_FGENES.373J 0.142 

326453 CH.19jsgil5867399 0.142 

60 311050 AI864581 Hs.215477 ESTs 0.142 

308740 AI802711 Hs.210337 EST; Weakly similar to aldolase A [H.sapiens] 0.142 

331003 H63959 Hs.142722 ESTs 0.142 

338010 CH22_EMAC005500.GENSCAN.128-8 0.142 

336326 CH22_FGENES.812_4 0.142 

65 318100 R44308 Hs.242302 ESTs 0.142 

320641 R55421 EST cluster (not In UnlGene) 0.142 

325855 CH.16Jisgl|5867067 0.142 

330425 HG1728-HT1734 Non-Specific Cross Reacting Antigen (Gb:D90277), 

Alt Splice Form 2 0.142 
324683 AM25411 Hs.22581 ESTs
326268 CH.17_hs gi|5887267
331390 AA460341 Hs.45008 ESTs
338904 CH22JW32M0.GENSCAN.10-16
333098 CH22_FGENES.79_1
331919 AA446869 Hs.119316 ESTs
312214 AI248004 Hs.125187 ESTs
323198 AW179174 Hs.7984 ESTs
316107 AI204001 Hs.184014 ribosomal protein L31
301335 AA885317 Hs.190511 ESTs
337392 CH22_FGENES.747-3
325543 CH.12_hs gi|6682452
305903 AA873085 EST singleton (not in UniGene) with exon hit
332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin)
337913 CH22_EM:AC005500.GENSCAN.59-10
301436 AA961061 Hs.131696 ESTs
335078 CH22_FGENES.486_5
338451 C^EM:AC005500.GENSCAN.359-a9
302777 AJ230640 EST cluster (not in UniGene) with exon hit
330464 J0306B Hs.78223 N-acylaminoacyl-peptide hydrolase
330988 H4H11 Hs.33855 ESTs
328939 CH.08_hs gi|6004481
308015 AI440174 Hs.228907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens]
328504 CH.07_hs gi|5868471
332599 AA402891 Hs.32951 solute carrier family 29 (nucleoside transporters); member 2
335744 CH22_FGENES.601_15
322394 AF077208 EST cluster (not in UniGene)
323892 AL042661 EST cluster (not in UniGene)
318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA-5 CHAIN PRECURSOR [H.sapiens]

CH22_FGENES.843_7
H08815 Hs.159824 EST
327672 CH.04_hs gi|5867843
335900 CH22_FGENES.635_8
336044 CH22_FGENES.679_6

318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens]

CH22JK3ENES.165J
CH22.FGENES.139J
EST singleton (not in UniGene) with exon hit
CH22_FGENES.599_22
CH.14_hs gi|6138923
CH.01_hs gi|6249563
CH22_BA354I12.GENSCAN.18-1
CH.18~lhsgi[58672fl3
CH.08_p2 gi|6007576
Hs.174131 ribosomal protein L6

CH22_EM:AC005500.GENSCAN.164-1
CH22_DA59H18.GENSCAN.18-7
CH.05_hs gi|5867964
CH22_FGENES.41-8
EST cluster (not in UniGene)
Hs.12024 ESTs

EST singleton (not in UniGene) with exon hit
Hs.250178 copinatl

CH.02_hs gi|6017023
CH22 FGENES.513J3
CH22_DA59H18.GENSCAN.22-1
Hs.33026 ESTs; Weakly similar to similar to Enterococcus faecalis TRAB [C.elegans]
308550 AI697008 Hs.201811 EST 

302175 AA282760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381
303252 AA156760 EST cluster (not in UniGene) with exon hit 

337414 CH22_FGENES.757-2
310382 AI734009 EST cluster (not in UniGene) 

329333 CH.X_hsgi|58688G6 
305993 AA889197
335719 
325682 
327350 
339291 
326358 
330316 
308150 
338065 
339009 
327776 
336664 
321921 
319346 
304265 
303818 
327498 
335227 
339022 
302597 



AI499346 
338857 CH22_FGENES.2917 0.145


332565 AA234896 Hs.25272 E1A binding protein p300 0.145
318834 AI928098 Hs.156832 ESTs 0.145
336318 CH22_FGENES.801_1 0.145
310960 AI923551 Hs.170843 ESTs 0.145
335346 CH22_FGENES.537J2 0.145
331198 T65416 Hs.12826 ESTs 0.145
337607 CH22_C20H12.GENSCAN.17-3 0.146
331206 T84096 Hs.15284 ESTs 0.146
301793 T80698 EST cluster (not in UniGene) with exon hit 0.146
319590 AA210878 EST cluster (not in UniGene) 0.146
311394 AI695374 Hs.256231 ESTs 0.146
324773 AA632554 Hs.163401 ESTs 0.146
324841 AI142359 Hs.155316 ESTs 0.146
332260 N70088 Hs.138467 ESTs 0.146
329276 CH.X_hs gi|5868762 0.146
3358S7 CH22_FGENES.633J 0.146
338294 CH22_EM:AC005500.GENSCAN.297-1 0.146
336993 CH22_FGENES.409-4 0.146
334135 CH22.FGENES.336J 0.146
326251 CH.17_hs gi|5867263 0.146
337396 CH22_FGENES.749-1 0.146
339167 CH22_DA59H18.GENSCAN.69-8 0.146
316838 AW135418 Hs.161210 ESTs 0.148
325313 CH.11_hs gi|5866865 0.146
331047 N66918 Hs.32205 ESTs 0.146
323915 AL043362 EST cluster (not in UniGene) 0.146
302747 AF062275 EST cluster (not in UniGene) with exon hit 0.146
306317 AA947909 EST singleton (not in UniGene) with exon hit 0.146
334399 CH22_FGENES.382_5 0.146
326472 CH.19_hs #867404 0.146
333061 CH22_FGENES.75_4 0.146
337072 CH22_FGENES.448-5 0.146
334328 CH22_FGENES.375_5 0.146
327039 CH.21_hs gi|6531965 0.146
325576 CH.12_hs gi|6552443 0.147
315935 AI075804 Hs.132660 ESTs 0.147
319638 AA323758 EST cluster (not in UniGene) 0.147
334501 CH22_FGENES.397_17 0.147
338238 CH22_EM:AC005500.GENSCAN.264-4 0.147
308638 AI744063 EST singleton (not in UniGene) with exon hit 0.147
336567 CH22_FGENES.843_6 0.147
335819 Cr£2_FGBJES.819_j2 0.147
336950 CH22_FGENES.361-8 0.147
307055 AI148477 EST singleton (not in UniGene) with exon hit 0.147
315134 AW504854 Hs.126714 ESTs 0.147
335834 CH22_FGENES.621J 0.147
327870 CH.06_hs gi|5868131 0.147
323802 AA332011 Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 0.147
329412 CH.X_hs gi|6682553 0.147
323791 AA333068 EST cluster (not in UniGene) 0.147
324126 AA385315 EST cluster (not in UniGene) 0.147
327865 CH.06_hs gi|5868130 0.147
333445 CH22 FGENES.154J! 0.147
321302 AA021351 Hs.158497 KIAA0724 gene product 0.147
336744 CH22_FGENES.118-9 0.147
323731 AA323414 EST cluster (not in UniGene) 0.148
320289 H07989 EST cluster (not in UniGene) 0.148
305488 AA749000 EST singleton (not in UniGene) with exon hit 0.148
305592 AA780594 Hs.62954 ferritin; heavy polypeptide 1 0.148
304094 H11295 EST singleton (not in UniGene) with exon hit 0.148
325040 AW296368 EST cluster (not in UniGene) 0.148
339034 CH22_DA59H18.GENSCAN.26-2 0.148
334504 CH22_FGENES.396_2 0.148
334778 CH2-LFGENES.431J 0.148
320148 U77494 Hs.119667 RAN binding protein 8 0.148
303584 AW173759 Hs.203401 ESTs 0.148
325826 CH.15_hs gi|5887048 0.148
331192 T55182 Hs.152571 ESTs; Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 0.148
325785 CH.10_hs gi|6381957 0.148
333166 CH22_FGENES.91_8 0.148
336548 CH22_FGENES.841_5 0.148
337552 CH22_C4G1.GENSCAN.1-4 0.148
331775 AA382742 Hs.97151 EST 0.148
338936 CH22JU32I10.GENSCAN.19-6 0.148
331869 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148
332865 CH22_FGENES.28_5 0.148
328663 CH.07_hs gi|6004473 0.148
328436 CH.07_hs gi|5868417 0.148
311158 AI634864 Hs.250789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148
336942 CH22_FGENES.354-2 0.148
302262 R53169 Hs.246091 ESTs 0.149
333296 CH22_FGENES.132_3 0.149
333365 CH22L.FGENES.142J 0.149
311706 AW452392 Hs.252854 ESTs 0.149
337109 CH22_FGENES.489-2 0.149
315062 AW173300 Hs.190201 ESTs 0.149
333454 CH22.FGENES.157J 0.149
334784 CH22_FGENES.432J 0.149
333255 CH22_FGENES.118J 0.149
337518 CH22_FGENES.81*7 0.149
320651 AA489268 EST cluster (not in UniGene) 0.149
323437 AA287567 EST cluster (not in UniGene) 0.149
328761 CH.07_hs gi|5868302 0.149
328787 CH.07_hs gi|5868309
335261 CH22_.FGENES.520J 0.149
300827 R16689 Hs.106004 ESTs 0.149
339263 CH22_BA354I12.GENSCAN.10-1 0.149
337412 CH22_FGENES.756-6 0.149
334414 CH22_FGENES.384J 0.149
332931 CH22_FGENES.38_5 0.149
310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149
305216 AA669056 EST singleton (not in UniGene) with exon hit 0.149
314779 AA470122 Hs.190261 ESTs 0.149
338414 CH22_EM:AC005500.GENSCAN.341-27 0.149
303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149
337509 CH22_FGENES.8064 0.149
306631 AI001149 EST singleton (not in UniGene) with exon hit 0.149
302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149
336536 CH22.FGENES.839J8 0.149
324666 T32458 Hs.14285 ESTs 0.149
310173 AI767433 Hs.170013 ESTs 0.149
333595 CH22.FGENES.21 1 J 0.149
335975 CH22_FGENES.652J 0.15
306654 AI003654 EST singleton (not in UniGene) with exon hit 0.15
335025 CH22.FGENES.475J 0.15
328711 CH.07_hs gi|5868271 0.15
328274 CH.07_hs gi|5868219 0.15
325505 CH.12_hs gi|6682451 0.15
329641 CH.14_p2 gi|6468233 0.15
304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15
339103 CH22_DA59H18.GENSCAN.44-10 0.15
329636 CH.12_p2 gi|5302817 0.15
310118 AI203293 Hs.157489 ESTs 0.15
326056 CH.17_hs gi|5867184 0.15
303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15
303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15
TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs in
Table 13. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
GenBank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The GenBank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: GenBank accession numbers
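The three columns defined above (Pkey, CAT number, Accession) map naturally onto a small lookup structure. The following is a minimal Python sketch, assuming each table row can be recovered as a single whitespace-delimited line. The names `ClusterRow`, `parse_cluster_row`, and `index_by_accession` are hypothetical and are not part of the application. The sample row pairs a Pkey and CAT number listed in Table 13A with the accession shown for that Pkey in Table 13, and is assembled for illustration only.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One Table 13A row: probeset key, gene cluster number, member accessions."""
    pkey: int
    cat_number: str
    accessions: List[str]


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number, and accessions.

    Assumption: the first field is the Pkey, the second is the CAT number,
    and every remaining field is a GenBank accession number.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least Pkey and CAT number: {line!r}")
    pkey, cat_number, *accessions = fields
    return ClusterRow(int(pkey), cat_number, accessions)


def index_by_accession(rows: Iterable[ClusterRow]) -> Dict[str, List[int]]:
    """Build a reverse index from accession number to the Pkeys that list it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


if __name__ == "__main__":
    # Illustrative row only; the column pairing is an assumption.
    row = parse_cluster_row("323915 110063.1 AL043362")
    print(row)
    print(index_by_accession([row]))
```

The CAT number is kept as a string because the values in the table carry a suffix after the cluster number and are not plain integers.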



Pkey CAT number Accession 



322050 24275.1 
321439 1599424J 
321666 13653.22 



300088 622937 J 
322303 704603J 
322394 2749^.1 



321758 44275J 
323109 155498J 



322533 38937.1 
321921 34680J 
321927 21620.1 



321932 265316J 
306971 14694.7 



AL137589 AA423949 BE222949 BE222694 AI199615 AW873116 AI277950 AW044290 AW630096
H61962 W01567 N75711

BE259906 AA232518 AA013359 AL035788 AW160822 BE387134 BE002954 BE391839 AW161565 AI878841 BE616458
BE409981 BE387308 BE297436 BE315536 AA206924 R12012 AA214169 BE312812 BE337093 H11710 BE312009 
BE260569 AA343588 AA219526 R34757 AA219749 BE336733 AA219751 AW411099 AA232408 BE018716 BE398089 
AA206253 AA053487 AA114224 AV655868 AW732566 BE394087 AW732574 AA313442 BE336875 AA070548 BE259840
BE019828 AW732341 AA299916 BE019253 BE018238 BE387109 AA232304 BE255589 AW732585 M181436 AA308777 
AA075802 AW732521 AA314526 AA226747 BE409513 AA206168 BE388292 BE298782 BE387086 AA305310 AV652723 
AA314918 BE615510 AW951763 BE398104 BE385195 BE407165 BE391336 BE390187 BE389189 BE540650 BE249884
BE385985 BE274245 BE391124 BE260080 AA182600 BE512821 BE390090 BE279398 BE279589 BE263454 BE515194 
BE293569 BE272531 BE388814 BE384659 BE271685 BE561043 BE278449 BE302572 AW239076 AI750583 AA376179 
AA112632 BE266324 BE266614 R13105 AA132286 BE296305 AI220355 AA205606 AA219527 AA219519 AW804310
AA083286 BE171208 T19693 AA338328 BE185868 AA903024 T92162 AA330119 BE410404 BE314668
AW576245 BE207878 AW299993 AI199558 AI285442 AW299994 AW394242 AW394184
AI357412 AI870708 AI590539 W07459

AW068287 AA310079 BE338702 AA356318 AA306059 AA348785 AW402633 AA311210 AW402909 N76879 AW402913
AW401920 AA321636 AA354474 C17297 C16938 AA311774 M29871 NM_002872 2821 88 AW405674 H94176 R89281
AA214723 AI014482 AW949347 T27749 AW804226 AW796964 AW404581 AF077208 NM_014029 W68830 W79652
AA353375 AW575218 AA552192 AA521232 M702695 AA033975 AW407827 AA829948 N94402 AW628604 AI523308
N57605 AA641662 H42477 N52784 AI753478 AA768493 AA845729 W47391 N55270 AI090117 R89282 BE206172
AA076650 AA595650 AI218931 BE049397 AI433110 W74114 H94277 AI358827 AI085221 AI862818 AA835967 AW103905
AI640644 AA835507 AA856887 AA694392 AW337542 AI524410 BE045500 AI440060 AI358801 AW028238 AW205248
AI718264 R48618 AA357358 AI695002 AA897549 AW081065 AI433360 AI810783 AI620963 Z82188 AA360224
U29112 AI656540 AI364875 AI656246 AI990940

AA169345 AI762857 AI949997 AI809601 AI681948 AI221079 AW167404 AI347614 AI611090 AI023472 AI347683 AI027467
AW591788 AI380665 AA835735 AA836654 AI244G28 AW193159 AI500112 AI918722 AJ738693 AI702308 AA805365
AI766842 

T59538 T59589 T59598 T59542 AF147374 
AF070619 R20302 T80358 

AJ223366 BE305086 AW820106 AA621983 BE305208 AI738475 AI380189 AW590847 AI127232 AA622706 AI380858
AA621975 AI587036 AA665743 AW204003 AI692234 AI002242 AI692219 AW137282 AW268783 AW295910 AI308015
AW301482 AI318288 AI318575 AI316117 AI345591 AI249650 AI246934 AI246864 AI246971 AW268311 AI249654 BE041907
AW732776 

N72324 N52825 W19526 BE143464 AA376060 

M83667 NM_005195 S63168 M83667 AW068039 AW630649 AI338577 AI018125 AI269878 AW242440 AI887823 AI342581
BE222416 AI582847 AI651011 AI660815 AI699574 BE550201 AI926996 AW665855 AI827752 AI761857 BE328168
BE222451 AI762201 AW000929 AW007207 BE042962 BE551843 BE465373 AI279179 AI949945 BE551862 AW051667
BE328076 BE222298 AW007229 AW772332 AI279801 AI934526 AI631938 AI770103 BE041412 AI417900 AI692655
AI869943 AW270119 AI431739 AI703347 AW770568 AW025473 AI701497 AI128026 BE328147 AW203980 BE046793
AW087704 AI674597 AI650732 AI813691 AI472092 AI695224 AK41217 AW207746 AI206840 AI271362 AI631788 AI911883
AI914619 AI380585 AI767501 AI823759 AI564116 AI190991 AJ377369 AI814122 AI221623 AI354793 AI081988 AI391740
AI337435 BE467366 AI824347 AI565325 AI280038 AI640455 AI819744 BE467803 BE327524 AI149402 AI313187 BE219684
AW611948 AW665821 AI091260 AW044492 BE220366 AW025381 AW183264 AI694865 AI498474 AI129780 AI202028
AI566792 BE220659 AI928040 AI830696 AI493021 AW612488 AI913152 BE042965 AI631837 AI693873 AI498925 AI768668
AM01544 BE327023 AI693383 AI769874 AI744003 AW082273 AI688501 AI798177 AI985196 AI090033 AI432342 AI689918
AI638308 BE468080 BE219588 AI912119 BE219787 AW005392 BE326564 AI589039 AI860187 AI758143 AJ338168
AI702938 BE221885 AI498727 AI918196 AI279735 AW771497 AI860133 AW237834 AW661759 AW028111 BE503416
AI360180 AW611715 AI871777 BE045447 BE326444 AI266547 AI800237 AI823315 AW78368 AI264281 AI675841 AI690041
301119 33384J



324019 262792J 

323437 189513J 

307845 19B04J0 

324126 272259.1 

309101 7570J 



315703 119175.1



301373 368214J 

323665 54093.1 

323676 220254J 

302066 23306.1 

323731 226193.1 

323791 232336.1 

325040 23854.1 

324430 312113J 

323892 477253.1 

309468 1030131.1 

302251 27216.4 

302286 22717.6 

323915 110063.1 

324594 330528.1 

301737 65.1 



AI498018 AI554124 AI239893 AI864054 AI280099 AI192815 AI620465 AI080201 AW002057 BE500988 AI341131 AI818991
AI566137 AI123403 BE219192 AW183844 AI499842 AW137971 AW138720 AW015526 AW138160 AW243163 AW138705
AW139927 AW140006 AW138810 AW137450 AW206970 AW135419 AW205974 AA043494 BE465106 AW139955 AI741112
BE326942 AA043506 AI076957 AI942432 AI392902 AI097047 AI470599 AA514553 AA984008 N47949 AI654114 AA884832
AI796752 AI765290 AI301155 AW470358 BE222764 AI823569 AI651188 AI692695 AI476643 BE504307 AT767573 BE219719
AI932249 AW467075 AI913633 BE221966 AI091025 AA969215 AI799810 AA931170 BE048559 AI809606 AI138614
AI739456 AI674605 AW772068 AI089286 AI625787 AI263418 AW008638 AI928389 AW628997 AI470010 AI914168 AI760003
AI203050 AI334069 AI694788 BE045337 AI948659 AI912982 AI867131 AI192102 AI767583 AI347518 AI586005 AI025884
AI215888 AI633904 AW182265 AW614357 AI128030 AI343685 AI914263 AI985003 AI823578 AI493053 AI380285 AI633895
AI267880 AI538162 AI991552 BE219479 BE219296 AJ302178 AW779296 AI913805 AI631644 AI566772 AI985498 AI942289
AI935659 AI339092 AI247432 AI686472 AI766886 AI017228 AI333272 AW301668 AI972218 AW082027 AI632974 AI474761
AI766127 AW236578 AW000968 AI870734 AI222399 AI871249 AI703448 BE464210 AI768037 AI871585 AI767871 AI738757
AI220732 AI681633 AI763783 AI684463 AI307339 AI263203 AW665264 BE463969 AI768786 AI439118 AI127913 BE218324
AI872342 BE220052 AI796163 AI221662 AW197672 AW025300 AI769681 AW612448 BE219757 AW072420 AI669980
AI830418 AW204353 AA047011 AA913868 AI739146 AI669954 AW470507 AW614835 AW302151 AW772372 AI762427
AW339902 AW303370 BE464775 AW299818 AW236072 AW195060 AW274737 AW263062 AW183846 AI868894 AW300493
AW172509 AW516876 AW593773 AW299474 AW303546 AI817323 AI823624 AI694QG5 AI934589 AI343479 AI861825
AW957299 AA352608 AA676752 AA410510 AA358874 AI865724 AA853679 AI699265 AW188789 N47380 AA233715
BE258194 R55421 R55643 H42362 AA243884

AW886407 AA489268 R57015 R58094 BE077459 BE077423 BB46995 AW849216 T69383 AW938111 H60337 BE221073
AB033100 AA347036 BE260325 AW961669 AL047207 AA347037 AI766894 AA601045 AI559897 AW139033 AW274622
AW172884 AW089070 AA804340 AW798925
AA825266

AL137354 AL043375
AA971985 
AA977992 
AA989542 
AA989598 
AA989713 
AA991487 
AI000246
AI000248
AI001149 
AI003654 
AI041589 
AI051696 
AI452732 
AI470948 
AI4769U 
AI055966
AI066577 
AI086929 
AI095385 
AI127883 
AI559492 
AI565612 
308289 AI571211 
308311 AI581855 
308332 AI591235 
308511 AI687580 
308601 AJ7 19930 

308612 AI735634 
308636 AI744063 
308814 AI819263 
308851 AI829820 
308981 AI873242 
310570 1071946J AI318327 AI318328 AI318495
305022 AA627416 
305060 AA635771 
305070 AA639783 



306051 19085J 



321163 171 122 J 
321235 1102181J 
320603 4297J 



320641 185591J 
320651 58648J 
321325 28266J 

305704 484759_-1 

322011 23158.1 

306407 

306454 

306516 

306518 

306526 

306534 

306590 

306591 

306631 

306654 

306766 

306799 

308023 

308070 



306805 
306814 
306873 
306911 
306982 
308238 
308258
305079
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305488 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307126 
305903 



AA641329 

AA653159 

AW512978 

AA669056 

AA679467 

AA679772 

AA721052 

AA723748 

AA749000 

AA773530 

AA780975 

AA782319

AA789095 

AA826544 

AA827608 

AA831819 

AA835278 

AIU0639 

AI148477 

AI148709 

AA845997 

AA857665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951 

AA873085 



328803 c_7_hs
328809 c_7_hs
305949 AA884409
328829 c_7_hs
330021 c16_p2
330024 c16_p2
330028 c16_p2
330049 c17_p2
305993 AA889197 

330095 c19_p2

330096 c19_p2 

307205 AI192479 
307427 AI243437 
307491 AI268539 
307581 AJ284415 
307588 AI285535
337672 CH22.6002FCL.UNK.EMAC00 
337693 CH2a„6030FGL.UNK.EM:ACOO 
337738 CH22_6083FCL_UNK_EM^C00 
307692 AI318342
307806 AI351739
309107 AI925823
309230 AI970747 
339338 CH22_8300FQ„UNK_BA354J1 
309257 AI934183 
309366 AW072970 
309422 AW087175 
325207 c10_hs 
325257 c11_hs

309646 AW194694 
309651 AW195850 
325313 c11_hs

309924 AW340812 
334030 CH22J308FGL320JUINK_EM 
334040 CH22J318FGL322_8_UNK_EM 
334083 CH22_1361FG_327„38_UNieE 
332810 CH22^6FG_7_12JJNK_C65E1 
302747 32813J AF062275 L03830 

302753 33029 J M74299 M74302 M74303 

302777 33803J AJ230640 AJ230648 
304094
302824 



325870 
304240 
304410 
304443 
304475 
304522 
304878 
304705 
306004 
306008 
306013 
306082 
336174 
306094 
304823 
304872 
304918 
304955 
306249 



35372J 
41 198 J 
C16JS 



H11285 

U21260 U21258 
AF054663 AF124197 R70292



AA009802
AA284508 
AA399444 
AA428879
AA465405 
AA548556 
AA564064 
AA889992 
AA894390 
AA896990 
AA908508 
CH2^3567FGL710_1.UKK_PA 
AA908877 



306295 
306317 
306347 
306365 
306398 
330401 
330463 



330535 
332634 



entre^D28383 
460J 



1374 -8 
10404.2 



AA584837 
AA595289 
AA602697 
AA613504 
AA933840 
AA936892 
AA937331 
AA947909 
AA961144 
AA962086 
AA970548 
D28383 

NM_001055 AA332948 U26309 U09031 L19955 L10819 AI366043 X84654 U71086 AV654451 AJ007418 AA053625
BE168856 AA376730 H12694 AA810348 AA621972 AI818950 AV645387 AI819966 AA910602 AW512449 H67893 AI310497 
AI304330 AI339217 AW193588 AW438688 AI818970 AW316789 AA906527 AA777570 N47673 AI336428 AW945133 
AI038606 R29692 AW194197 AI304748 H12639 AA053178 AA493213 AA676958 AA113154 AI313469 AI368239 R93183
W24532 U52852 U54701 AL046854 AA365795 
U11872 

U24488 NM_007116
TABLE 13B shows the genomic positioning for those primekeys lacking UniGene IDs and
accession numbers in Table 13. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7-digit numbers in this column are GenBank Identifier (GI) numbers.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
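The four columns defined above can be read into a structured record when a row carries all of them on one line. The following is a minimal Python sketch, assuming rows of the form `Pkey Ref Strand start-end` and assuming the nucleotide range is inclusive at both ends. The names `ExonRow` and `parse_exon_row` are hypothetical, and the `Minus` strand label is an assumption, since only `Plus` appears in the rows shown here.

```python
import re
from dataclasses import dataclass

# Pkey, free-text sequence source, strand label, then "start-end" positions.
ROW_PATTERN = re.compile(r"^(\d+)\s+(.+?)\s+(Plus|Minus)\s+(\d+)-(\d+)\s*$")


@dataclass(frozen=True)
class ExonRow:
    """One Table 13B row describing a predicted exon."""
    pkey: int
    ref: str
    strand: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: both positions are inclusive nucleotide coordinates.
        return self.end - self.start + 1


def parse_exon_row(line: str) -> ExonRow:
    """Parse a single-line row; raise ValueError if the layout is not recognized."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized Table 13B row: {line!r}")
    pkey, ref, strand, start, end = match.groups()
    return ExonRow(int(pkey), ref, strand, int(start), int(end))


if __name__ == "__main__":
    row = parse_exon_row("333698 Dunham, I. et al. Plus 7205279-7205383")
    print(row, row.length)
```

Rejecting unrecognized lines instead of guessing at them lets rows with damaged position fields be set aside for manual review.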



Pkey Ref Strand Nt_position



332791 Dunham, I. et al. Plus
332792 Dunham, I. et al. Plus
332810 Dunham, I. et al. Plus
332944 Dunham, I. et al. Plus
332972 Dunham, I. et al. Plus
333133 Dunham, I. et al. Plus
333154 Dunham, I. et al. Plus
333155 Dunham, I. et al. Plus
333227 Dunham, I. et al. Plus
333230 Dunham, I. et al. Plus
333298 Dunham, I. et al. Plus
333304 Dunham, I. et al. Plus
333305 Dunham, I. et al. Plus
333365 Dunham, I. et al. Plus
333383 Dunham, I. et al. Plus
333391 Dunham, I. et al. Plus
333392 Dunham, I. et al. Plus
333397 Dunham, I. et al. Plus
333403 Dunham, I. et al. Plus
333413 Dunham, I. et al. Plus
333445 Dunham, I. et al. Plus
333479 Dunham, I. et al. Plus
333481 Dunham, I. et al. Plus
333483 Dunham, I. et al. Plus
333516 Dunham, I. et al. Plus
333517 Dunham, I. et al. Plus
333518 Dunham, I. et al. Plus
333531 Dunham, I. et al. Plus
333566 Dunham, I. et al. Plus
333572 Dunham, I. et al. Plus
333588 Dunham, I. et al. Plus
333588 Dunham, I. et al. Plus
333594 Dunham, I. et al. Plus
333595 Dunham, I. et al. Plus
333600 Dunham, I. et al. Plus
333601 Dunham, I. et al. Plus
333607 Dunham, I. et al. Plus
333612 Dunham, I. et al. Plus
333813 Dunham, I. et al. Plus
333614 Dunham, I. et al. Plus
333624 Dunham, I. et al. Plus
333626 Dunham, I. et al. Plus
333635 Dunham, I. et al. Plus
333637 Dunham, I. et al. Plus
333642 Dunham, I. et al. Plus
333647 Dunham, I. et al. Plus
333653 Dunham, I. et al. Plus
333654 Dunham, I. et al. Plus
333656 Dunham, I. et al. Plus
333657 Dunham, I. et al. Plus
333658 Dunham, I. et al. Plus



72720-73315
73381-73768
304296-304384
2414825-2414932
2572152-2572236
3360058-3360195
3615887-3616019
3616832-3617003
3995507-3996507
4581537-4581947
4629943-4630242
4630388-4630645
4786883-4787283
4907179-4907277
4916697-4916780
4918294-4918433
4922466-4922635
4925140-4925256
4943824-4943974
5097827-5097885
5272855-5272939
5286358-5286505
5297945-5298105
5570204-5570390
5570729-5570925
5571761-5572025
5622622-5622684
5954226-5954473
6026896-6027189
6246834-6247314
6255445-6255779
6323103-6323348
6355629-6355925
6360075-6360442
6504431-6504690
6549563-6549697
6550643-6550748
6551227-6551389
6595146-6595244
6614174-6614467
6663683-6663973
6674968-6675134
6708760-6709139
6772502-6772779
6811130-6811392
6816731-6816993
6822087-6822406
6831369-6831445
6835282-6835474






333859 Dunham, I. et al. Plus 6836179-6836248
333684 Dunham, I. et al. Plus 7169531-7169742
333686 Dunham, I. et al. Plus 7177117-7177302
333697 Dunham, I. et al. Plus 7203859-7203934
333698 Dunham, I. et al. Plus 7205279-7205383
333699 Dunham, I. et al. Plus 7206101-7206175
333703 Dunham, I. et al. Plus 7215559-7215663
333709 Dunham, I. et al. Plus 7229730-7229835
333747 Dunham, I. et al. Plus 7605884-7606206
333774 Dunham, I. et al. Plus 7716509-7716636
333775 Dunham, I. et al. Plus 7729983-7730149
333806 Dunham, I. et al. Plus 7877475-7877666
333843 Dunham, I. et al. Plus 7978762-7978887
333854 Dunham, I. et al. Plus 8029446-8029524
333873 Dunham, I. et al. Plus 8133266-8133429
333880 Dunham, I. et al. Plus 8151823-8152133
333885 Dunham, I. et al. Plus 8154352-8154437
333918 Dunham, I. et al. Plus 8307124-8307215
333947 Dunham, I. et al. Plus 8579888-8579966
333961 Dunham, I. et al. Plus 8617999-8618104
333981 Dunham, I. et al. Plus 8782374-8782643
333991 Dunham, I. et al. Plus 8837419-8837551
333994 Dunham, I. et al. Plus 8852749-8852894
334030 Dunham, I. et al. Plus 9288463-9288782
334083 Dunham, I. et al. Plus 9837016-9837081
334111 Dunham, I. et al. Plus 10279365-10278531
334135 Dunham, I. et al. Plus 10457085-10457183
334218 Dunham, I. et al. Plus 12680289-12680378
334249 Dunham, I. et al. Plus 13190430-13190574
334262 Dunham, I. et al. Plus 13231452-13231581
334264 Dunham, I. et al. Plus 13234447-13234544
334327 Dunham, I. et al. Plus 13577413-13577496
334328 Dunham, I. et al. Plus 13589868-13589936
334340 Dunham, I. et al. Plus 13642407-13642522
334454 Dunham, I. et al. Plus 14326506-14326738
334504 Dunham, I. et al. Plus 14510206-14510398
334508 Dunham, I. et al. Plus 14514936-14515122
334512 Dunham, I. et al. Plus 14545933-14546366
334582 Dunham, I. et al. Plus 15026255-15026371
334659 Dunham, I. et al. Plus 15460624-15460726
334721 Dunham, I. et al. Plus 15796816-15796987
334723 Dunham, I. et al. Plus 15805317-15805399
334730 Dunham, I. et al. Plus 15967830-15967834
334774 Dunham, I. et al. Plus 16251857-16252178
334778 Dunham, I. et al. Plus 16276180-16276395
334851 Dunham, I. et al. Plus 17820110-17820810
334885 Dunham, I. et al. Plus 19233667-19233787
334902 Dunham, I. et al. Plus 19317083-19317195
334905 Dunham, I. et al. Plus 19322553-19322680
334906 Dunham, I. et al. Plus 19323493-19323590
334910 Dunham, I. et al. Plus 19398155-19398684
335018 Dunham, I. et al. Plus 20688288-20688415
335025 Dunham, I. et al. Plus 20743941-20744050
335033 Dunham, I. et al. Plus 20753188-20753314
335044 Dunham, I. et al. Plus 20842088-20842682
335142 Dunham, I. et al. Plus 21465105-21465186
335157 Dunham, I. et al. Plus 21543302-21544341
335160 Dunham, I. et al. Plus 21573388-21573497
335174 Dunham, I. et al. Plus 21631301-21631447
335188 Dunham, I. et al. Plus 21669118-21669328
335180 Dunham, I. et al. Plus 21680807-21680876
335191 Dunham, I. et al. Plus 21681110-21681183
335193 Dunham, I. et al. Plus 21692208-21692362
335204 Dunham, I. et al. Plus 21750636-21750726
335222 Dunham, I. et al. Plus 21885542-21885608
335226 Dunham, I. et al. Plus 21890838-21890930
335227 Dunham, I. et al. Plus 21892145-21892289
335309 Dunham, I. et al. Plus 22500158-22600276
335310 Dunham, I. et al. Plus 22500714-22500831
335310 Dunham, I. etal. Plus 22500714-22500831 
335311 Dunham, I. et al. Plus 22501602-22501676
335355 Dunham, I. et al. Plus 22779222-22779516
335362 Dunham, I. et al. Plus 22809167-22809461
335368 Dunham, I. et al. Plus 22843040-22843184
335384 Dunham, I. et al. Plus 22918150-22918263
335385 Dunham, I. et al. Plus 22919072-22919339
335436 Dunham, I. et al. Plus 23427793-23427923
335440 Dunham, I. et al. Plus 23458702-23459017
335441 Dunham, I. et al. Plus 23460632-23460724
335450 Dunham, I. et al. Plus 23480190-23480270
335453 Dunham, I. et al. Plus 23483333-23483459
335458 Dunham, I. et al. Plus 23490034-23490143
335464 Dunham, I. et al. Plus 23500331-23500496
335496 Dunham, I. et al. Plus 24164386-24164545
335497 Dunham, I. et al. Plus 24167688-24167869
335498 Dunham, I. et al. Plus 24172082-24172161
335499 Dunham, I. et al. Plus 24176698-24176869
335500 Dunham, I. et al. Plus 24178236-24178326
335507 Dunham, I. et al. Plus 24219973-24220039
335510 Dunham, I. et al. Plus 24222975-24223118
335513 Dunham, I. et al. Plus 24224272-24224496
335627 Dunham, I. et al. Plus 25150005-25150061
335651 Dunham, I. et al. Plus 25317560-25317696
335655 Dunham, I. et al. Plus 25333211-25333369
335656 Dunham, I. et al. Plus 25333601-25333751
335658 Dunham, I. et al. Plus 25336315-25336406
335663 Dunham, I. et al. Plus 25342680-25342802
335665 Dunham, I. et al. Plus 25344098-25344287
335667 Dunham, I. et al. Plus 25345735-25345856
335668 Dunham, I. et al. Plus 25346313-25346447
335689 Dunham, I. et al. Plus 25454350-25454604
335690 Dunham, I. et al. Plus 25455442-25455625
335715 Dunham, I. et al. Plus 25565941-25566052
335719 Dunham, I. et al. Plus 25593936-25594101
335734 Dunham, I. et al. Plus 25688723-25688869
335744 Dunham, I. et al. Plus 25716483-25716615
335809 Dunham, I. et al. Plus 26310772-26310909
335819 Dunham, I. et al. Plus 26356341-26356470
335822 Dunham, I. et al. Plus 26364087-26364196
335872 Dunham, I. et al. Plus 26820760-26820943
335885 Dunham, I. et al. Plus 26933436-26933534
335968 Dunham, I. et al. Plus 27743843-27744029
335971 Dunham, I. et al. Plus 27752808-27753017
335975 Dunham, I. et al. Plus 27801321-27801391
335976 Dunham, I. et al. Plus 27809041-27809187
335989 Dunham, I. et al. Plus 27983788-27983860
335990 Dunham, I. et al. Plus 27988532-27988608
336010 Dunham, I. et al. Plus 28570239-28570330
336093 Dunham, I. et al. Plus 29556922-29557002
336126 Dunham, I. et al. Plus 30057891-30058105
336129 Dunham, I. et al. Plus 30062259-30062348
336187 Dunham, I. et al. Plus 30433494-30433585
336188 Dunham, I. et al. Plus 30434870-30435004
336225 Dunham, I. et al. Plus 30833614-30833788
336371 Dunham, I. et al. Plus 33968108-33968204
336373 Dunham, I. et al. Plus 33976308-33976504
336377 Dunham, I. et al. Plus 33994489-33994599
336380 Dunham, I. et al. Plus 33995323-33995434
336383 Dunham, I. et al. Plus 34005784-34005964
336384 Dunham, I. et al. Plus 34007429-34007559
336335 Dunham, I. et al. Plus 34007879-34008159
336386 Dunham, I. et al. Plus 34012865-34013115
336441 Dunham, I. et al. Plus 34187606-34187663
336444 Dunham, I. et al. Plus 34190585-34190718
336484 Dunham, I. et al. Plus 34237425-34237505
336497 Dunham, I. et al. Plus 34267190-34267245
336499 Dunham, I. et al. Plus 34267504-34267572
336503 Dunham, I. et al. Plus 34271306-34271372
336548 Dunham, I. et al. Plus 34353861-34354826
336552 Dunham, I. et al. Plus
336553 Dunham, I. et al. Plus
336567 Dunham, I. et al. Plus
336568 Dunham, I. et al. Plus
336659 Dunham, I. et al. Plus
336715 Dunham, I. et al. Plus
336603 Dunham, I. et al. Plus
336805 Dunham, I. et al. Plus
336650 Dunham, I. et al. Plus
336657 Dunham, I. et al. Plus
336911 Dunham, I. et al. Plus
336949 Dunham, I. et al. Plus
336950 Dunham, I. et al. Plus
336958 Dunham, I. et al. Plus
336993 Dunham, I. et al. Plus
337076 Dunham, I. et al. Plus
337109 Dunham, I. et al. Plus
337123 Dunham, I. et al. Plus
337151 Dunham, I. et al. Plus
337189 Dunham, I. et al. Plus
337241 Dunham, I. et al. Plus
337337 Dunham, I. et al. Plus
337353 Dunham, I. et al. Plus
337384 Dunham, I. et al. Plus
337398 Dunham, I. et al. Plus
337414 Dunham, I. et al. Plus
337418 Dunham, I. et al. Plus
337461 Dunham, I. et al. Plus
337480 Dunham, I. et al. Plus
337482 Dunham, I. et al. Plus
337483 Dunham, I. et al. Plus
337490 Dunham, I. et al. Plus
337522 Dunham, I. et al. Plus
337532 Dunham, I. et al. Plus
337552 Dunham, I. et al. Plus
337584 Dunham, I. et al. Plus
337611 Dunham, I. et al. Plus
337672 Dunham, I. et al. Plus
337693 Dunham, I. et al. Plus
337738 Dunham, I. et al. Plus
337926 Dunham, I. et al. Plus
337927 Dunham, I. et al. Plus
337935 Dunham, I. et al. Plus
337944 Dunham, I. et al. Plus
337954 Dunham, I. et al. Plus
337996 Dunham, I. et al. Plus
338004 Dunham, I. et al. Plus
338016 Dunham, I. et al. Plus
338174 Dunham, I. et al. Plus
338176 Dunham, I. et al. Plus
338238 Dunham, I. et al. Plus
338277 Dunham, I. et al. Plus
338294 Dunham, I. et al. Plus
338316 Dunham, I. et al. Plus
338323 Dunham, I. et al. Plus
338324 Dunham, I. et al. Plus
338386 Dunham, I. et al. Plus
338398 Dunham, I. et al. Plus
338410 Dunham, I. et al. Plus
338414 Dunham, I. et al. Plus
338460 Dunham, I. et al. Plus
338481 Dunham, I. et al. Plus
338489 Dunham, I. et al. Plus
338500 Dunham, I. et al. Plus
338514 Dunham, I. et al. Plus
338530 Dunham, I. et al. Plus
338620 Dunham, I. et al. Plus
338631 Dunham, I. et al. Plus
338653 Dunham, I. et al. Plus



34356420-34356527 

34356683-34356753 

34426228-34428395 

34428521-34428637 

1896402-1896478 

3110198-3110314 

6106904-6106990 

6126661-6126766 

7745284-7745355 

8130457-8130612 

11035818-11035984 

12818687-12618891 

12875843-12875912 

13203550-13203973 

15096270-15096324 

19338177-19338679 

21166580-21166650 

22052874-22052942 

23106433-23106510 

24225887-24225954 

27280182-27280313 

30395182-30395285 

30804624-30804780 

31333399-31333580 

31585902-31586067

31953012-31953205 

32014049-32014131 

32803968-32804028 

33219714-33219779 

33227865-33227946 

33237292-33237427 

33318571-33318644 

33963188-33963979 

34187269-34187366 

19497-19600 

945236-945452 

1482883-1483016 

3331236-3331313 

3575975-3576153 

3865738-3865814 

6286377-6286470 

634303^343172 

6534661-6534782 

6589383-6589450 

6831433-6831620 

7445532-7445633 

7601363-7601520 

7863131-7863310 

12771102-12771268 

12774072-12774223 

14661936-14662015 

16167622-16167962 

16463958-16464539 

1708971M7089988 

17154655-17154792 

17155309-17155574 

18611213-18611407 

18953492-18953581 

19292807-19292916 

19345573-19345660 

20233372-20233488 

20942659-20942873 

21142605-21143049 

21253847-21253974 

21379420-21379555

21636361-21636509 

23540239-23540334 

23711167-23711241 

24219427-24219509 
338660 Dunham, I. et al. Plus 24387122-24387268
338704 Dunham, I. et al. Plus 25230432-25230548
338847 Dunham, I. et al. Plus 27995337-27995420
338887 Dunham, I. et al. Plus 28465244-28465384
338895 Dunham, I. et al. Plus 28598893-28599135
338916 Dunham, I. et al. Plus 28824881-28824977
338925 Dunham, I. et al. Plus 28883892-28884036
338938 Dunham, I. et al. Plus 29148022-29148160
338952 Dunham, I. et al. Plus 29418831-29418968
338980 Dunham, I. et al. Plus 29896789-29896874
338981 Dunham, I. et al. Plus 29897917-29898008
338986 Dunham, I. et al. Plus 30007287-30007415
339009 Dunham, I. et al. Plus 30348477-30348598
339017 Dunham, I. et al. Plus 30420896-30421090
339045 Dunham, I. et al. Plus 30744286-30744356
339046 Dunham, I. et al. Plus 30746269-30746420
339059 Dunham, I. et al. Plus 30814655-30814801
339067 Dunham, I. et al. Plus 30869347-30869412
339069 Dunham, I. et al. Plus 30880975-30881070
339078 Dunham, I. et al. Plus 30914310-30914423
339084 Dunham, I. et al. Plus 30944556-30944803
339101 Dunham, I. et al. Plus 31158047-31158123
339102 Dunham, I. et al. Plus 31169321-31169563
339103 Dunham, I. et al. Plus 31170343-31170454
339115 Dunham, I. et al. Plus 31459869-31459927
339157 Dunham, I. et al. Plus 32131701-32131833
339166 Dunham, I. et al. Plus 32210902-32211006
339167 Dunham, I. et al. Plus 32213567-32213730
339288 Dunham, I. et al. Plus 33169611-33169691
339289 Dunham, I. et al. Plus 33186756-33186903
339291 Dunham, I. et al. Plus 33205057-33205247
339407 Dunham, I. et al. Plus 34189461-34189620

332865 Dunham, I. et al. Minus 1391482-1391218
332881 Dunham, I. et al. Minus 1563520-1563184
332930 Dunham, I. et al. Minus 2022565-2022497
332931 Dunham, I. et al. Minus 2023651-2023562
332984 Dunham, I. et al. Minus 2632606-2632457
332986 Dunham, I. et al. Minus 2635398-2635206
332997 Dunham, I. et al. Minus 2710509-2710375
333051 Dunham, I. et al. Minus 2991973-2991840
333081 Dunham, I. et al. Minus 3029631-3029527
333064 Dunham, I. et al. Minus 3030722-3030623
333096 Dunham, I. et al. Minus 3184234-3184118
333099 Dunham, I. et al. Minus 3206796-3206674
333106 Dunham, I. et al. Minus 3230744-3230547
333160 Dunham, I. et al. Minus 3654893-3654678
333163 Dunham, I. et al. Minus 36651244634962
333165 Dunham, I. et al. Minus 3674052-3673905
333166 Dunham, I. et al. Minus 36946644694567
333170 Dunham, I. et al. Minus 37333944733299
333174 Dunham, I. et al. Minus 3764284-3764210
333188 Dunham, I. et al. Minus 38269904826863
333214 Dunham, I. et al. Minus 39665594966437
333232 Dunham, I. et al. Minus 4001551-4001365
333237 Dunham, I. et al. Minus 4003326-4003219
333239 Dunham, I. et al. Minus 4095861-4094462
333255 Dunham, I. et al. Minus 4297883-4297716
333259 Dunham, I. et al. Minus 4306769-4306639
333274 Dunham, I. et al. Minus 4389146-4388954
333290 Dunham, I. et al. Minus 4530734-4530554
333295 Dunham, I. et al. Minus 4549290-4549198
333296 Dunham, I. et al. Minus 4550766-4550644
333310 Dunham, I. et al. Minus 4637315-4637232
333311 Dunham, I. et al. Minus 4637933-4637844
333312 Dunham, I. et al. Minus 4638794-4638635
333313 Dunham, I. et al. Minus 4639397-4639277
333316 Dunham, I. et al. Minus 5405980-5405876
333318 Dunham, I. et al. Minus 4642636-4642564
333321 Dunham, I. et al. Minus 4649080-4648934





333327 Dunham, I. et al. Minus 4657947-4657828
333335 Dunham, I. et al. Minus 4672656-4872564
333337 Dunham, I. et al. Minus 4677930-4677841
333454 Dunham, I. et al. Minus 51370073136880
333458 Dunham, I. et al. Minus 6143942-5143806
333459 Dunham, I. et al. Minus 5144548-5144344
333470 Dunham, I. et al. Minus 5223319-5223088
333493 Dunham, I. et al. Minus 4637315-4637232
333496 Dunham, I. et al. Minus 54046433404523
333498 Dunham, I. et al. Minus 5405980-5405876
333510 Dunham, I. et al. Minus 55576283557469
333546 Dunham, I. et al. Minus 58866433886442
333561 Dunham, I. et al. Minus 5903659-5903590
333738 Dunham, I. et al. Minus 7552160-7552084
333780 Dunham, I. et al. Minus 7750367-7750277
333783 Dunham, I. et al. Minus 7751850-7751777
333818 Dunham, I. et al. Minus 7911959-7911762
333894 Dunham, I. et al. Minus 81888553188709
333897 Dunham, I. et al. Minus 81943903194284
333900 Dunham, I. et al. Minus 82002683200122
333909 Dunham, I. et al. Minus 82296393229477
333936 Dunham, I. et al. Minus 85128053512564
333944 Dunham, I. et al. Minus 85570513556938
334040 Dunham, I. et al. Minus 9342995-9342934
334154 Dunham, I. et al. Minus 10570714-10570572
334178 Dunham, I. et al. Minus 11755052-11754971
334188 Dunham, I. et al. Minus 11925963-11925834
334273 Dunham, I. et al. Minus 13265608-13265522
334282 Dunham, I. et al. Minus 13285293-13285178
334285 Dunham, I. et al. Minus 13289990-13289793
334286 Dunham, I. et al. Minus 13291759-13291569
334303 Dunham, I. et al. Minus 13454331-13454217
334305 Dunham, I. et al. Minus 13456310-13456209
334306 Dunham, I. et al. Minus 13461157-13461049
334320 Dunham, I. et al. Minus 13496857-13496717
334352 Dunham, I. et al. Minus 13675908-13675828
334353 Dunham, I. et al. Minus 13683722-13683596
334359 Dunham, I. et al. Minus 13728664-13728534
334363 Dunham, I. et al. Minus 13740004-13739812
334365 Dunham, I. et al. Minus 13742078-13741971
334399 Dunham, I. et al. Minus 14186289-14186163
334409 Dunham, I. et al. Minus 14195181-14195075
334414 Dunham, I. et al. Minus 14234033-14233932
334470 Dunham, I. et al. Minus 1438958M4389442
334483 Dunham, I. et al. Minus 14428355-14428281
334489 Dunham, I. et al. Minus 14455428-14454288
334498 Dunham, I. et al. Minus 14483789-14483700
334501 Dunham, I. et al. Minus 14487509-14487356
334502 Dunham, I. et al. Minus 14488605-14488526
334543 Dunham, I. et al. Minus 14834496-14834116
334622 Dunham, I. et al. Minus 15191678-15191609
334650 Dunham, I. et al. Minus 15371251-15371178
334680 Dunham, I. et al. Minus 15520047-15519887
334745 Dunham, I. et al. Minus 16049960-16049653
334756 Dunham, I. et al. Minus 16128678-16128528
334758 Dunham, I. et al. Minus 16132368-16132233
334761 Dunham, I. et al. Minus 16138424-16138319
334763 Dunham, I. et al. Minus 16148136-16148077
334784 Dunham, I. et al. Minus 16294548-16294360
334790 Dunham, I. et al. Minus 16307576-16307509
334793 Dunham, I. et al. Minus 16330748-16330681
334802 Dunham, I. et al. Minus 16413158-16413026
334820 Dunham, I. et al. Minus 16764338-16764249
334824 Dunham, I. et al. Minus 16857777-16857674
334832 Dunham, I. et al. Minus 17173957-17173760
334842 Dunham, I. et al. Minus 17464352-17464181
334844 Dunham, I. et al. Minus 17503891-17503768
334657 Dunham, I. et al. Minus 18488368-18488242
334927 Dunham, I. et al. Minus 19988711-19987853
334939 Dunham, I. et al. Minus 20131162-20131054
334951 Dunham, I. et al. Minus 20147708-20147502
334969 Dunham, I. et al. Minus 20188176-20188020
334972 Dunham, I. et al. Minus 20294734-20294611
335050 Dunham, I. et al. Minus 20884109-20883951
335078 Dunham, I. et al. Minus 21059529-21059458
335102 Dunham, I. et al. Minus 21313841-21313598
335105 Dunham, I. et al. Minus 21320563-21320440
335110 Dunham, I. et al. Minus 21334136-21333811
335111 Dunham, I. et al. Minus 21335946-21335809
335115 Dunham, I. et al. Minus 21388250-21388146
335116 Dunham, I. et al. Minus 21388573-21388414
335185 Dunham, I. et al. Minus 21651593-21651522
335186 Dunham, I. et al. Minus 21658438-21656338
335230 Dunham, I. et al. Minus 21899517-21898678
335236 Dunham, I. et al. Minus 21915016-21914870
335243 Dunham, I. et al. Minus 21933519-21933365
335249 Dunham, I. et al. Minus 21950851-21950669
335258 Dunham, I. et al. Minus 22043431-22043262
335261 Dunham, I. et al. Minus 22063937-22063772
335276 Dunham, I. et al. Minus 22154036-22153937
335279 Dunham, I. et al. Minus 22168834-22168638
335330 Dunham, I. et al. Minus 22556589-22556422
335331 Dunham, I. et al. Minus 22556823-22556708
335334 Dunham, I. et al. Minus 22560390-225601%
335346 Dunham, I. et al. Minus 22641097-22640918
335349 Dunham, I. et al. Minus 22661861-22661271
335611 Dunham, I. et al. Minus 25070825-25070706
335612 Dunham, I. et al. Minus 2507232B-25072H2
335671 Dunham, I. et al. Minus 25358629-25358533
335676 Dunham, I. et al. Minus 25395274-25395152
335680 Dunham, I. et al. Minus 25402437-25402361
335750 Dunham, I. et al. Minus 25732501-25731972
335752 Dunham, I. et al. Minus 25757026-25756890
335755 Dunham, I. et al. Minus 25763806-25763747
335767 Dunham, I. et al. Minus 25819547-25819218
335774 Dunham, I. et al. Minus 25883733-25883572
335777 Dunham, I. et al. Minus 25885770-25885599
335778 Dunham, I. et al. Minus 25886469-25886334
335797 Dunham, I. et al. Minus 25958182-25956030
335800 Dunham, I. et al. Minus 25985373-25985280
335818 Dunham, I. et al. Minus 26323886-26323744
335834 Dunham, I. et al. Minus 26391707-26391530
335840 Dunham, I. et al. Minus 26420596-26420538
335844 Dunham, I. et al. Minus 26433427-26433344
335846 Dunham, I. et al. Minus 26436727-26436621
335856 Dunham, I. et al. Minus 26662452-26662346
335887 Dunham, I. et al. Minus 26939225-26938782
335888 Dunham, I. et al. Minus 26943037-26942820
335889 Dunham, I. et al. Minus 26946988-26946901
335890 Dunham, I. et al. Minus 26949087-26948665
335893 Dunham, I. et al. Minus 26973898-26973747
335895 Dunham, I. et al. Minus 26975307-26975239
335896 Dunham, I. et al. Minus 26977639-26977558
335900 Dunham, I. et al. Minus 26980354-26980238
335907 Dunham, I. et al. Minus 27013352-27013273
335943 Dunham, I. et al. Minus 27446610-27446378
335956 Dunham, I. et al. Minus 27653729-27653635
335959 Dunham, I. et al. Minus 27682313-27682145
335962 Dunham, I. et al. Minus 27704276-27704144
336040 Dunham, I. et al. Minus 29036458-29036300
336044 Dunham, I. et al. Minus 29043828-29043727
336047 Dunham, I. et al. Minus 29050617-29050466
336068 Dunham, I. et al. Minus 29252077-29251969
336143 Dunham, I. et al. Minus 30135948-30135854
336158 Dunham, I. et al. Minus 30163730-30163610
336174 Dunham, I. et al. Minus 30241988-30241639
336223 Dunham, I. et al. Minus 30816306-30316195
336245 Dunham, I. et al. Minus 31420569-31420509
336274 Dunham, I. et al. Minus 32085468-32035303
336318 Dunham, I. et al. Minus 33364452-33364338
336326 Dunham, I. et al. Minus 33567328-33567201
336339 Dunham, I. et al. Minus 33798479-33798330
336340 Dunham, I. et al. Minus 33812069-33811915
336355 Dunham, I. et al. Minus 33874750-33874649
336392 Dunham, I. et al. Minus 34015868-34015736
336393 Dunham, I. et al. Minus 34016145-34015951
336394 Dunham, I. et al. Minus 34016457-34016298
336400 Dunham, I. et al. Minus 34023437-34023298
336402 Dunham, I. et al. Minus 34024090-34023981
336413 Dunham, I. et al. Minus 34046702-34046576
336424 Dunham, I. et al. Minus 34055549-34055491
336425 Dunham, I. et al. Minus 34058544-34058446
336437 Dunham, I. et al. Minus 34074154-34074090
336447 Dunham, I. et al. Minus 34198207-34197996
336449 Dunham, I. et al. Minus 34204707-34204577
336466 Dunham, I. et al. Minus 34213195-34213046
336492 Dunham, I. et al. Minus 34255578-34255437
336511 Dunham, I. et al. Minus 34277480-34277351
336512 Dunham, I. et al. Minus 34278373-34278275
336520 Dunham, I. et al. Minus 34319184-34319101
336522 Dunham, I. et al. Minus 34320169-34320056
336524 Dunham, I. et al. Minus 34321055-34320921
336527 Dunham, I. et al. Minus 34322071-34321966
336534 Dunham, I. et al. Minus 34326797-34326620
336536 Dunham, I. et al. Minus 34327678-34327538
336542 Dunham, I. et al. Minus 34331316-34331183
336556 Dunham, I. et al. Minus 34375244-34374907
336557 Dunham, I. et al. Minus 34375443-34375341
336558 Dunham, I. et al. Minus 3437582544375698
336559 Dunham, I. et al. Minus 3437643044376261
336560 Dunham, I. et al. Minus 34376814-34376596
336561 Dunham, I. et al. Minus 34377168-34376928
336597 Dunham, I. et al. Minus 7627912-7627757
336601 Dunham, I. et al. Minus 13265853-13265654
336642 Dunham, I. et al. Minus 1304281-1304212
336645 Dunham, I. et al. Minus 1351268-1351168
336662 Dunham, I. et al. Minus 2158060-2157993
336664 Dunham, I. et al. Minus 1993558-1993481
336876 Dunham, I. et al. Minus 2022565-2022497
336684 Dunham, I. et al. Minus 2158060-2157093
336686 Dunham, I. et al. Minus 2160698-2160486
336714 Dunham, I. et al. Minus 30940264093871
336719 Dunham, I. et al. Minus 33316314331503
336736 Dunham, I. et al. Minus 4093128-4093041
336744 Dunham, I. et al. Minus 4333001-4332848
336788 Dunham, I. et al. Minus 54199734419873
336793 Dunham, I. et al. Minus 56313454631237
336859 Dunham, I. et al. Minus 82017564201561
336863 Dunham, I. et al. Minus 63966734396425
336933 Dunham, I. et al. Minus 11760045-11759981
336942 Dunham, I. et al. Minus 12027537-12027455
336960 Dunham, I. et al. Minus 13267243-13267172
336969 Dunham, I. et al. Minus 13725722-13725643
336971 Dunham, I. et al. Minus 13732308-13732221
337003 Dunham, I. et al. Minus 15523541-15523422
337011 Dunham, I. et al. Minus 16106423-16106080
337070 Dunham, I. et al. Minus 19034423-19034321
337072 Dunham, I. et al. Minus 19077452-19077323
337086 Dunham, I. et al. Minus 19857011-19656881
337140 Dunham, I. et al. Minus 22649450-22649388
337193 Dunham, I. et al. Minus 24594969-24594874
337256 Dunham, I. et al. Minus 27659956-27659876
337278 Dunham, I. et al. Minus 28429017-28428848
337284 Dunham, I. et al. Minus 28491414-28491094
337293 Dunham, I. et al. Minus 28846334-28845873
337316 Dunham, I. et al. Minus 29657129-29656997
337326 Dunham, I. et al. Minus 30017199-30017069
337382 Dunham, I. et al. Minus 31233666-31233579
337392 Dunham, I. et al. Minus 31442311-31442229
337406 Dunham, I. et al. Minus 31864640-31864588
337412 Dunham, I. et al. Minus 31916487-31916312
337419 Dunham, I. et al. Minus 32021496-32021170
337438 Dunham, I. et al. Minus 32257869-32257739
337455 Dunham, I. et al. Minus 32434517-32434425
337509 Dunham, I. et al. Minus 33414613-33414498
337518 Dunham, I. et al. Minus 33796750-33796647
337529 Dunham, I. et al. Minus 34043668-34043546
337533 Dunham, I. et al. Minus 34193388-34193261
337539 Dunham, I. et al. Minus 34254490-34254322
337551 Dunham, I. et al. Minus 34524446-34524362
337553 Dunham, I. et al. Minus 24230-24160
337591 Dunham, I. et al. Minus 1006414-1006184
337592 Dunham, I. et al. Minus 1007791-1007634
337593 Dunham, I. et al. Minus 1009480-1009291
337607 Dunham, I. et al. Minus 1355719-1355637
337612 Dunham, I. et al. Minus 1570235-1570142
337635 Dunham, I. et al. Minus 2169690-2169569
337824 Dunham, I. et al. Minus 4559540-4559266
337825 Dunham, I. et al. Minus 4567155-4567005
337850 Dunham, I. et al. Minus 5077143-5076943
337854 Dunham, I. et al. Minus 5153435-5153272
337913 Dunham, I. et al. Minus 6149343-6149786
337915 Dunham, I. et al. Minus 5922748-5922690
337968 Dunham, I. et al. Minus 7095797-7095680
338010 Dunham, I. et al. Minus 7754282-7754184
338012 Dunham, I. et al. Minus 7761421-7761351
338017 Dunham, I. et al. Minus 7864521-7864401
338065 Dunham, I. et al. Minus 7235048-7234950
338094 Dunham, I. et al. Minus 9595602-9595440
338129 Dunham, I. et al. Minus 10915338-10915237
338132 Dunham, I. et al. Minus 10989617-10989530
338150 Dunham, I. et al. Minus 11478551-11478355
338157 Dunham, I. et al. Minus 11731444-11731375
338185 Dunham, I. et al. Minus 13484103-13483972
338255 Dunham, I. et al. Minus 15242294-15242231
338276 Dunham, I. et al. Minus 16109555-16109398
338431 Dunham, I. et al. Minus 19747608-19747496
338448 Dunham, I. et al. Minus 20151152-20151054
338451 Dunham, I. et al. Minus 20174266-20174193
338477 Dunham, I. et al. Minus 20821897-20821838
338534 Dunham, I. et al. Minus 21771238-21771170
338682 Dunham, I. et al. Minus 24800712-24800481
338684 Dunham, I. et al. Minus 24827522-24827428
338689 Dunham, I. et al. Minus 24893073-24892972
338695 Dunham, I. et al. Minus 25104153-25104016
338825 Dunham, I. et al. Minus 27664798-27664712
338842 Dunham, I. et al. Minus 27824238-27824079
338893 Dunham, I. et al. Minus 26491807-28491631
338904 Dunham, I. et al. Minus 28766345-28766253
338935 Dunham, I. et al. Minus 29071537-29071461
339022 Dunham, I. et al. Minus 30523414-30523289
339034 Dunham, I. et al. Minus 30621603-30621422
339190 Dunham, I. et al. Minus 32403103-32402985
339212 Dunham, I. et al. Minus 32494335-32494210
339213 Dunham, I. et al. Minus 32496590-32496440
339218 Dunham, I. et al. Minus 32504250-32504109
339233 Dunham, I. et al. Minus 32751331-32751238
339258 Dunham, I. et al. Minus 32934756-32934815
339262 Dunham, I. et al. Minus 32971258-32971090
339263 Dunham, I. et al. Minus 32974634-32974452
339265 Dunham, I. et al. Minus 32975943-32975806
339338 Dunham, I. et al. Minus 33468728-33466606
339396 Dunham, I. et al. Minus 3401730644017205
339400 Dunham, I. et al. Minus 34045024-34044940
339425 Dunham, I. et al. Minus 34407911-34407798
325207 6552430 Plus 140049-140170
329568 3962490
329517 3983513
325313 5866865
325327 5866875
325317 5866878
325257 5666895
329632 6729060
325371 5866920
325375 5866920
325378 5866920
325469 6017034
925470 6017034
325576 6552443
325505 6682451
325543 6682452
329635 5302817
329636 5302817
325593 5866992
325675 5867014
325704 5867028
325682 6138923
325785 6381957
325666 6469822
325816 6682490
329777 6002090
329768 6015501
329759 6048280
329731 6065783
329687 6117856
329676 6272128
329667 6272129
329669 6272129
329670 6272129
329641 6468233
329791 6469354
325826 5867048
325829 5867052
329888 6067149
329893 6525313
329899 6563505
325988 5867064
325855 5867067
325999 5867073
326001 5867073
325886 5867087
325882 5867087
325905 5867104
325922 5867122
325937 5867132
325960 5887147
325961 5887147
325838 6552452
325839 6552452
325840 6552452
325844 6552453
325870 6682492
329984 4646193
329976 4878063
329935 6165200
329916 6223624
330021 6671889
330024 6671908
330028 6671908
326033 5867178
326036 6867178
326056 5667184
326116 5867193
326122 5867194
326138 5867203
Minus


156551-156649 


Plus 


10897-10955 


Plus 


192813-193017 


Minus 


1035422-1035536 


Minus 


1 165503-1 iDOOlO 


Minus 


1187981-1188167 


Plus 


286823-286991 


Plus 


287578-287663 


Minus 


137769-137894


Minus 


240852-240946 


Plus 


151873-152057 


Minus 


62522-62622 


Minus 


64969-65078 


Minus 


469726-469860 


Plus


955517-955711 


Plus 


156188-156387 


Plus 


370618-370763 


Plus 


61849-62003 


Plus 


16769-16857 


Minus 


120278-120559


Minus 


191389-191479 


Plus 


118315-118422 


Minus 


37647-37730 


Plus 


158772-158900 


Minus 


22165-22288 


Minus 


142207-142359 


Plus 


101355-101745 


Plus 


131223-131291 


Plus 


131351-131495 


Minus 


105995-106107 


Minus 


131982-132089 


Minus 


46361-46458 


Plus 


232674-233060 


Minus 


37227-37473 


Minus 


166123-166791 


Minus 


111058-111783 


Plus 


17348-17606 


Plus 


276141-276251 


Plus 


149115-149192 


Plus 


155223-155346 


Plus 


194694-194915 


Minus 


8178-8347 


Plus 


78779-78876 


Minus 


329063-329134 


Minus 


152633-152902 


Minus 


162506-162635 


Minus 


165106-165209 


Plus 


171451-171532 


Plus 


181964-182037 


Plus 


184380-184547 


Minus 


14188-14332 


Plus 


228209-228297 


Minus 


139780-139890 


Minus 


62584-62691 


Minus 


69059-69127 


Plus 


36396-37195 


Plus 


120938-121032 


Minus 


1005-1270 


Minus 


30015-30144 


Plus 


37261-37333 


Minus 


120215-120273 


Minus 


181553-181690 


Plus 


45548-45604 


Plus 


144397-144683 


Minus 


179374-179436 






326145 5867204
326180 5867211
326201 5867216
326207 5887222
326226 5867230
326233 5867232
326238 5867260
326241 5867260
326243 5867261
326251 5867263
326268 5887267
326124 5916395
326339 6056311
330049 4567182
326358 5867293
326365 5667297
326379 5867327
326382 5867327
326390 5867340
326424 5867369
326453 5867399
326472 5867404
326492 5867422
326533 5867441
330117 6015201
330115 6015202
330116 6015202
330095 6015278
330098 6015276
326644 5867559
326713 5867595
326745 5867611
326752 5867815
326753 5867616
326598 6867634
326667 6552455
326855 6552460
326812 6682504
327005 5867664
327008 5867664
326896 5867680
326904 5887684
326951 6004446
326941 6004446
326943 6004446
326928 6456762
326958 6469836
326959 6469836
327039 6531965
327127 6682520
330158 6580367
327204 5867447
327208 5867447
327266 5867462
327277 5867473
327289 5867481
327296 5867492
327237 5867544
327145 5867548
327333 5902477
327335 5902477
327343 6017017
327350 6249563
327358 6552411
327360 6552411
327409 5867750
327424 5867751
327430 5687754
327470 5887772



Minus 


COCOO COO 4 A 


Minus 


-tOOTCO HQOOOO 

1 0^700* looZZZ 


Minus 


loofbo-iDoyoy 


Plus 


AQilti-AQOIQ 

4ol39-4oZl9 


Plus 


52o44-5Z7Q5 


Plus 


mA^GQ -4 O J OA'S 


Plus 


642S2-04330 


Minus 




Plus


123o3o-1239/o 


Minus 


82710-02022 


Plus


122114-1Z2705 


Plus 


4071Q2-4Q758U 


Minus 


1 04637-1 65201 


Minus 


o14oo2-oioZlQ 


Plus 


9 122-91 bo 


Minus 


96630-96764 


Plus


32299-32402 


Minus 


50420-50303


Minus 


108814-110582 


Minus 


168329-168409


Plus 


86222-86423 


Plus 


293739-293940 


Plus


120768-120991 


Minus 


532153-532280


Minus 


7340-7680


Plus 


11403-11677 


Plus 


12109-12418 


Plus


15343-15814 


Plus 


49370-49458 


Plus 


42684-42819 


Plus 


121511-121798 


Plus 


127130-127318 


Minus 


1214-1562 


Plus 


12454-12511 


Plus 


68955-69014 


Plus 


142311-142441 


Minus 


111390-111463 


Plus 


189811-189941 


Plus 


610847-610907 


Plus 


928737-928811 


Minus 


12032-12122 


Minus 


9280-9606 


Plus 


193812-193998 


Plus 


62018-62896 


Minus 


89242-89427


Minus 


291007-291219 


Minus 


42952-43082 


Minus 


43159-43301 


Plus 


694486-694998 


Plus 


41925-42083 


Plus


81966-82456 


Plus 


165135-165239 


Plus 


180805-180864 


Minus 


82400-82615 


Minus 


165616-165715 


Plus 


49296-49536 


Plus 


7627-8166


Minus 


59702-59813 


Minus 


40482-40551 


Minus 


141448-141609 


Minus 


142979-143124 


Minus 


12288-12395 


Minus 


41890-41965


Minus 


3802-3950 


Minus 


6255-6422 


Minus 


52949-53011 


Plus 


160442-160598 


Plus


1320-1403 


Plus 


150910-150973 
327460 6004455
327498 6017023
327509 6117815
327510 6117815
327512 6117815
327535 6525279
330163 6042042
330171 6648220
327579 6867824
327672 5867843
327629 5867872
327640 5867890
327649 5867899
327612 6525283
327718 6525284
327801 5867924
327762 5867961
327763 5867961
327776 5867964
327622 5867968
327823 5867968
327807 5867968
327845 6531962
330228 6013527
330190 6165182
328122 5868031
328132 5868038
328159 5868065
328168 5868071
328175 5868073
328217 5868096
327665 5868130
327866 5868131
327870 5868131
327879 5868142
327902 5868156
327918 5868165
327934 5868184
327959 5868210
327976 5868212
328020 5902482
328042 5902482
328008 5902482
330301 2905862
330299 2905881
328274 5868219
328595 5868224
328591 5868227
328668 5868254
328677 5868256
328687 5868262
328706 5868270
328711 5868271
328730 5868289
328732 5868289
328734 5868289
328752 5868298
328755 5868301
328761 5868302
328775 5868309
328784 5868309
328787 5868309
328809 5868327
328829 5868337
328280 5868352
328311 5868371
328318 5868373
328323 5868373
328348 5868383



Plus


1 focHO'l /00**0 


Minus 




Minus 


CjIOQO ccaeq 

54oo2«55U5o 


Minus 


5oo24-00944 


Plus 


170Z00-17o3Z0 


Plus 


19105-19175


Minus 


20321-20385


Plus


110889-1 llo/o 


Minus 


37229-00335 


Minus 


e ft© A fljs or A t\ 


Plus 


49q92-48811 


Plus 


8440-9000


Plus 


205871-205927


Plus 


2747-2824 


Plus 


00404 OCi DC 


Plus 


23239-23348 


Minus 


50303-50439


Plus 


229347-229476


Minus 


104300-104400


Minus 


168880-169633


Minus 


170359-170433


Plus 


33745-33811 


Plus 


193402-193549


Minus 


3718-3787


Plus 


36103-36243 


Plus 


158474-158656


Minus 


126737-126839 


Minus 


52957-53162 


Plus 


60321-60478 


Plus 


208-271 


Minus 


3742-4362


Plus 


61503-62205 


Minus 


2893-3046 


Plus 


53558-53757 


Minus 


77722-77793 


Minus 


133339-133467 


Plus 


547530-547591 


Plus 


41830-42036 


Minus 


46497-46682


Minus 


349301-349409 


Minus 


556386-556652 


Minus 


1985085-1986626 


Plus 


296663-297151 


Minus 


4420-5761 


Minus 


1020-1382 


Minus 


31244-31439 


Plus 


148738-148967 


Minus 


237647-237726 


Minus 


10888-10984 


Minus 


58708-58950 


Plus 


624479-624585 


Plus 


165501-165614 


Minus 


97797-97990 


Plus 


8068-8214 


Plus 


37437-37550 


Plus


50559-50747 


Minus 


114911-115087


Minus 


145959-146446 


Minus 


239308-239412 


Plus


12845-12920 


Minus 


74523-74604 


Plus


135772-135963


Plus 


91792-91849 


Plus 


36309-36630 


Plus 


160563-160631


Minus 


170560-170826 


Plus 


414845-415620 


Minus 


1080089-1080235 


Minus 


260272-260379 
326377 5868390


Plus 


1AQA7.17A9Q 


328436 5868417 


Plus 


O/Y17Cf\_9/T30fi/1 

2Qo7olr^U09U4 


326504 5868471 


Plus 




328506 5868471 


Plus 


DUM 0*01)0 OU 


328522 5868477 


Plus 


1 y/COUf -1 of cAqc 


328525 5868482 


Plus


1 9^07.1 
ItOOf" 1*W 10 


329541 5868486


Plus 


4 oAQca.1 Qi nrn 


328662 6004473 


Plus


litw/Yo-i itwooo 


326663 6004473 


Plus 


1 loD£/y-l 1O00O4 


328803 6004475 


Minus 




328304 6004478 


Minus




328927 5868500 


Minus 




328936 5868500 


Minus 


1 o022U£-l 052259 


328939 6004481 


Minus 


131139-131320


328941 6456765 


Minus 


A047.0QOC 


328946 6456765 


Plus 


28227-Z3413 


328968 6456775 


Plus 


1 17442-110200 


330316 6007576 


Minus 


119761-118931 


330350 3056622 


Minus 


2d413-2oo20 


330351 3056622 


Minus 


275Z2-Z7014 


330348 4544475 


Minus 


19850-18902 


329034 5868561


Minus 


32819-32939 


329046 5868569 


Plus


18971-19030 


329053 5868574 


Plus


426453-426541


329188 5668711


Minus




329237 5668729 


Plus 


133238-133339 


329276 5868762 


Minus 


222629-222709 


329333 5868806 


Plus 


392666-392746 


329376 5868859 


Plus 


52356-52694 


329384 6868869 


Minus 


116524-116662 


329140 6017060 


Plus 


290842-290905 


329317 6381976 


Plus 


614823-615209 


329319 6381976 


Plus 


721390-721470 


329129 6588026 


Plus 


144569-144712 


329373 6682537 


Minus 


38950-39301 


329412 6682553 


Minus 


68948-69041 


329424 5868879 


Plus 


362198-362344 


329448 5868886 


Plus


84776-84899 


329449 5868886 


Plus


97697-97771 
TABLE 14: Genes, including expressed sequence tags (ESTs), down-regulated in prostate
tumor tissue compared to normal prostate tissue, as analyzed using the Affymetrix/Eos Hu02
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate
cancer tissues.



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 
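
As a purely illustrative sketch (not the procedure disclosed in this application), the R1 ratio defined in the legend above could be computed as shown here. The function name `r1_ratio`, the `background` and `floor` parameters, and the use of a simple arithmetic mean for the "average" are all assumptions, and the intensities in the usage lines are made up.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Return mean background-subtracted normal signal / mean tumor signal.

    All parameters are hypothetical. `floor` (an assumption) keeps each
    background-subtracted value positive so the ratio stays defined.
    """
    normal_avg = mean(max(x - background, floor) for x in normal)
    tumor_avg = mean(max(x - background, floor) for x in tumor)
    return normal_avg / tumor_avg


# Made-up intensities for a single probeset (not data from Table 14).
ratio = r1_ratio(normal=[210.0, 190.0], tumor=[60.0, 40.0], background=10.0)
print(round(ratio, 2))
```

Under these assumptions, a ratio above 1 would correspond to lower expression in tumor tissue than in normal tissue, which is the sense of "down-regulated" used in the table caption.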



Pkey ExAccn UnigeneID Unigene Title R1 

331328 AA281133 Hs.88808 ESTs 18.53 

320875 D60641 Hs.131921 ESTs 14.55 

300994 AI251936 Hs.146298 ESTs 12.17 

323461 AA418762 Hs.190044 ESTs 10.55 

301015 AA947682 Hs.217173 ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 10.17 

319419 AA543096 Hs.13648 ESTs; Highly similar to mitogen-induced [M.musculus] 9.2 

323486 C05278 Hs.168800 ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE (LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens] 8.87 

324882 AW419080 Hs.250645 ESTs 8 

330569 U57796 Hs.57679 zinc finger protein 192 7.88 

330126 CH2U>2gi|6Q93735 7.8 

316265 AA737400 Hs.142230 ESTs 7.7 

323045 AA148950 Hs.188836 ESTs 7.64 

320668 R58399 Hs.146217 ESTs 7.4 

330769 AA465192 Hs.16514 ESTs 7.15 

312614 AI766732 Hs.201194 ESTs 7 

314790 AW341754 Hs.189305 ESTs 6.83 

309979 AW452118 Hs.257533 EST 6.74 

314236 AA743396 Hs.189023 ESTs 6.49 

329192 CH.X_hs gi|5868716 6.1 

324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 5.99 

303685 AW500106 EST cluster (not in UniGene) with exon hit 5.82 

314921 AW452382 Hs.257564 ESTs 5.8 

315840 AA679001 Hs.192221 ESTs 5.68 

332776 AA034364 Hs.256551 ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens] 5.43 

313533 AW298141 Hs.157975 ESTs 5.4 

303494 F30712 EST cluster (not in UniGene) with exon hit 5.35 

317490 AI627358 Hs.148367 ESTs 5.31 

332546 D84454 Hs.21899 solute carrier family 35 (UDP-galactose transporter); member 2 5.25 

334719 CH22_FGENES.421_30 5.25 

300679 AA813958 Hs.207727 ESTs; Moderately similar to KIAA0071 [H.sapiens] 5.22 

311811 AI625304 Hs.190312 ESTs 552 

315310 AW511298 Hs.256067 ESTs 5 - 1 ? 

312871 H86747 Hs2276G2 KIAA1116 protein 5.11 

324715 AI739168 EST cluster (not in UniGene) 4.97 

313870 AW206435 Hs.146057 ESTs 4.97 

321453 N50080 Hs.117827 ESTs 4.78 

316160 AW197887 Hs.253353 ESTs 4.63 

313833 AA766825 EST cluster (not in UniGene) 4.58 

315850 AW270550 Hs.116957 ESTs 4.53 

303124 AF161350 EST cluster (not in UniGene) with exon hit 4.46 

323346 AL134932 Hs.143607 ESTs 4.4 

301383 AA913591 Hs.126480 ESTs 4.35 

324513 AW501678 Hs.164577 ESTs 4.28 

303480 AA331906 EST cluster (not in UniGene) with exon hit 4.25 

323591 AA301270 EST cluster (not in UniGene) 4.22 

313603 AW468119 EST cluster (not in UniGene) 4.2 

317863 AI733395 Hs.129124 ESTs 4.1 

312381 R42049 Hs.195473 ESTs *M 

317514 AW451570 Hs.126850 ESTs 4.03 

319750 AA621606 Hs.117956 ESTs 4.03 
322520 T55958 EST cluster (not in UniGene) 

314754 AW026761 Hs.134374 ESTs 

316088 AI950652 Hs.208973 ESTs 

318473 AI939339 Hs.146883 ESTs 

307848 AI364186 EST singleton (not in UniGene) with exon hit 

300730 AW449204 Hs.257125 ESTs 

303034 W60843 Hs.31570 ESTs 

324668 AI679131 Hs.201424 ESTs 

324674 AA541323 Hs.115831 ESTs 

300547 N53442 Hs.143443 ESTs 

316100 AW203986 Hs2130G3 ESTs 

314801 AA481027 Hs.127336 ESTs; Weakly similar to ORF YGR245c [S.cerevisiae] 

320856 D59945 EST cluster (not in UniGene) 

313188 AJ0397Q2 Hs.179573 collagen; type I; alpha 2 

314187 AA804409 Hs.118920 ESTs 

311828 AA765470 Hs.122826 ESTs 

302358 D81150 EST cluster (not in UniGene) with exon hit 

311441 Z38720 Hs.151014 ESTs 

321914 AA011603 EST cluster (not in UniGene) 

332216 H95082 Hs.102332 EST 

324771 AA631739 EST cluster (not in UniGene) 

323691 AA317561 EST cluster (not in UniGene) 

303525 AW516519 Hs.115130 ESTs 

309709 AW242630 EST singleton (not in UniGene) with exon hit 

300038 AFFX control: MurIL4 

316526 AI088192 Hs.135474 ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens] 

313029 AA731520 Hs.170504 ESTs 

304356 AA196027 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 

314610 AI948686 Hs.191805 ESTs 

329815 CH.14_p2 gi|6624888 

314949 AI745387 Hs.239124 ESTs 

300598 N53574 Hs.158932 ESTs 

329218 CH.XJisgl|5S53726 

315706 AW440742 Hs.155556 ESTs 

303751 AW503637 EST cluster (not in UniGene) with exon hit 

307783 AI347274 EST singleton (not in UniGene) with exon hit 

321414 AA324975 Hs.128993 ESTs; Weakly similar to KIAA0465 protein [H.sapiens] 

312187 AA700439 Hs.188490 ESTs 

334061 CH22..FGENES.327J4 

336036 CH22_FGENES.678_7 

321477 H67818 Hs222G59 ESTs 

315760 AW139383 Hs.245437 ESTs 

316733 AA811713 Hs.163222 ESTs 

300855 AW235248 Hs.79828 ESTs 

323611 AA304986 Hs.145704 ESTs 

314138 AA740616 EST cluster (not in UniGene) 

316774 AA814859 EST cluster (not in UniGene) 

308884 AI833131 Hs.179100 ESTs 

331317 AA258222 Hs.87757 ESTs 

317221 AI989538 Hs.191074 ESTs 

316386 AA749062 Hs.180285 ESTs 

321040 H26953 EST cluster (not in UniGene) 

308828 AI824829 EST singleton (not in UniGene) with exon hit 

300778 AA236233 Hs.188716 ESTs 

316667 AW015940 Hs.232234 ESTs 

324614 AW503101 EST cluster (not in UniGene) 

316468 AW293046 Hs.255158 ESTs 

300671 AI239706 Hs.189886 ESTs 

314301 AW297867 Hs.188181 ESTs 

312335 AW043620 Hs.236993 ESTs 

322957 AA247755 EST cluster (not in UniGene) 

316848 AA830053 Hs.126798 ESTs 

313473 AA009660 Hs.251948 ESTs; Moderately similar to T07D3.7 [C.elegans] 

318518 T27119 EST cluster (not in UniGene) 

313383 AI076370 Hs.134037 ESTs 

331389 AA458637 Hs.162207 ESTs 

304257 AA053294 EST singleton (not in UniGene) with exon hit 

309917 AW340014 EST singleton (not in UniGene) with exon hit 

319661 H08035 Hs.21398 ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE 
321253 AI699484 
321193 AA149508 



ISOMERASE [H.sapiens] 
EST cluster (not In UniQene) 
Hs.103288 ESTs 

CH22.FGENES.28_4 






300027 

M11507 
324330 AA884766 
320014 AA137114 
333916 

318885 Z43272 
318146 AI040125 
323348 AA233056 
3G5703 AA825148 
335862 

317672 AW205409 
323416 A1610397 
312652 AI419909 
324094 AA382603 
319761 R84237 
317013 AA864466 
317383 AA913887 
314659 AW277121 
312479 AI950844 
332808 

311824 AW293826 
321992 C06003 
316074 AW517542 
309839 AW296076 
312071 AA683529 
312684 AW294020 
332668 AA062971 
322139 H53744 
304168 H77679 
325602 

319885 R59096 
300611 N75450 
316854 AA831215 
318208 A1091458 
331623 R38715 
324616 AI823999 
304968 AA614308 
314912 AI431345 
300767 AW193466 
313463 AI057369 
320600 AA135565 
301180 AI308989 
324825 AA704457 
300336 AW292417 



317850 
339047 
324580 
321142 
319478 
300793 
313733 
326505 
314987 
303114 
318709 
312878 
329224 
328018 
323231 
312887 
315183 
300259 
313240 
316697 



Hs.170291 



Hs/150521 
Hs.191518 
Hs.21229 

Hs.127748 
Hs.159560 



N29974 

AA492588 

AI817833 

R06841 

A1248571 

AA836116 

AW015506 
AF090948 
H24244 
AI209108 



AA324437 

AW157377 

AW136134 

A1479011 

AT743261 

AW293174 



Hs.135646 
Hs.126511 
HS254881 
Hs.128738 

Hs.250610 
Hs.116456 
Hs.208382 

Hs.143119 
Hs.1 17721 
Hs.181161 



AFFX control: transferrtn receptor 
EST cluster (not In UniQene) 
ESTs 

CH22_FGENES.296_5 
EST cluster (not in UniQene) 
ESTs 
ESTs 

F-box protein Fbwib 
CH22J=GENES.629_7 
ESTs 
ESTs 
ESTs 

EST cluster (not in UniQene) 
EST duster (not In UniQene) 
ESTs 
ESTs 
ESTs 

ESTs; Weakly similar to non-lens beta gamma-crystaKIn Eke protein [H.sapiens] 
.CH22J=GENES.7J0 
ESTs 
ESTs 
ESTs 

EST singleton (not in UniGene) with exon hit 
ESTs 
ESTs 

ESTs; Weakly similar to INHIBITOR OF APOPTOSIS PROTEIN 1 [M.muscu!us] 
EST cluster (not in UniGene) 
EST singleton (not in UniQene) with exon hit 
CH.13_hsgi|5866994 
Hs.136698 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.159066 ESTs; Weakly simitar to predicted using Genefinder [C.elegans] 
Hs.134559 ESTs 

Hs.153529 Homo sapiens done 24540 mRNA sequence 
Hs.162000 ESTs 

EST singleton (not In UniGene) with exon hit 
Hs.161784 ESTs 
Hs.136525 ESTs 
Hs.122536 ESTs 
Hs£50739 ESTs 
Hs.156939 ESTS 

Hs_55738 ESTs; Moderately similar to gag [H^apiens] 

Hs_S5074 ESTs; Moderately similar to high-risk human papilloma viruses E6 



Hs.209584 
Hs.186837 

Hs.130730 

Hs.240763 
Hs.143946 



Hs.1 77230 
Hs.132910 
Hs.220277 
Hs.170783 
Hs.131860 
HS252627 



EST duster (not In UniGene) 
CH22_DA59H18.GENSCAN.2e-7 
EST duster (not in UniGene) 
ESTs 

EST duster (not In UniGene) 
ESTs 

EST duster (not in UniGene) 

CH.19jisgi|5867435 

ESTS 

EST duster (not in UniGene) with exon hit 
ESTs; Weakly simitar to /prediction 
ESTs 

CH.XJisgi|5868728 

CHX)6Jis #802482 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 



2.95 
2.93 
2.93 
2.92 

2.91 

2.88 

2.88 

2.88 

2.87 

2.87 

2.85 

2.84 

2.83 

2.82 

2.81 

2.81 

2.81 

2.8 

2.8 

2.78 

2.78 

2.77 

2.75 

2.75 

2.73 

2.73 

2.73 

2.73 

2.72 

2.72 

2.72 

2.72 

2.71 

2.71 

2.71 

2.69 

2.68 

2.68 

2.68 

2.67 

2.67 

2.67 

2.65 

2.65 

2.65 

2.65 

2.64 

2.64 

2.64 

2.63 

2.62 

2.62 

2.61 

2.6 

2.6 

2.6 

2.59 

2.58 

2.57 

2.56 

2.56 

2.55 

2.55 

2.55 

2.54 

2.54 

2.53 
313966 AI807551 Hs.189061 ESTs 2.53 

331263 AA015718 ze31a12.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE36574 3', mRNA sequence 2.51 

310683 AW055233 Hs.160870 ESTs 2.5 

302566 AA085996 Hs.248572 Human PAC clone W404F18 from Xq23 2.5 

302697 AJ001408 EST cluster (not in UniGene) with exon hit 2.5 

308362 AI613519 EST singleton (not in UniGene) with exon hit 2.49 

322347 AF086538 EST cluster (not in UniGene) 2.49 

316240 AA974253 Hs.120319 ESTs 2.49 

323208 AA203415 Hs.136200 ESTs 2.48 

321643 W76005 Hs.32094 ESTs 2.48 

330723 AA243617 Hs.31082 ESTs; Highly similar to db83 [R.norvegicus] 2.48 

323455 AA256675 Hs.200438 ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 2.47 

308383 AI624497 EST singleton (not in UniGene) with exon hit 2.47 

328744 CH.07_hs gi|5868290 2.47 

332344 W45574 Hs.252497 ESTs 2.47 

328121 CH.06_hs gi|5868031 2.47 

321915 AI670955 Hs.200151 ESTs 2.46 

314954 AA521381 Hs.187726 ESTs 2.45 

302821 AA188868 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 2.45 

329454 CH.Y_hs gi|5868887 2.45 

336605 CH22_FGENES.420_4 2.45 

300664 AM44628 Hs.256809 ESTs 2.44 

323362 AL135067 Hs.117182 ESTs 2.44 

300024 M10098 AFFX control: 18S ribosomal RNA 2.44 

325026 AI671168 Hs.12285 ESTs 2.43 

324510 AI148353 Hs.120849 ESTs 2.43 

313389 AI765182 Hs.119903 ESTs 2.43 

301309 M78276 Hs.255917 ESTs 2.43 

313570 AA041455 Hs.209312 ESTs 2.43 

316504 AW135854 Hs.132458 ESTs 2.42 

319401 R01342 EST cluster (not in UniGene) 2.42 

312827 AI744361 Hs.205591 ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 2.42 

327871 CH.06JS gf(5888131 2.41 

337173 CH22_FGENES.565-3 2.41 

302948 AA465635 EST cluster (not in UniGene) with exon hit 2.41 

324303 AL118754 EST cluster (not in UniGene) 2.4 

315527 AI791138 Hs.116768 ESTs 2.4 

315979 AA830515 Hs.222917 ESTs 2.4 

331310 AA253351 Hs.44439 STAT induced STAT inhibitor-4 2.4 

321095 AA017595 Hs.32844 ESTs 2.4 

308561 AI701559 EST singleton (not in UniGene) with exon hit 2.39 

313035 N36417 Hs.144928 ESTs 2.37 

322114 AA643791 Hs.191740 ESTs 2.37 

313671 W49823 Hs.145553 ESTs 2.37 

303211 AA099548 Hs.191436 ESTs; Highly similar to dJ1118D244 [H.sapiens] 2.37 

301256 AA932948 EST cluster (not in UniGene) with exon hit 2.36 

338165 CH22_EM:AC005500.GENSCAN.2123 2.36 

324692 AA557952 EST cluster (not in UniGene) 2.35 

318587 AA779704 Hs.168830 ESTs 2.35 

312378 R41582 Hs.109219 retinal degeneration B beta 2.35 

318625 T48448 Hs.193162 ESTs 2.35 

305181 AA663726 Hs.116922 EST 2.35 

300815 AA286678 EST cluster (not in UniGene) with exon hit 2.34 

324063 AW292740 Hs.254815 ESTs 2.34 

315859 AA682305 Hs.133268 ESTs 2.33 

305092 AA642912 EST singleton (not in UniGene) with exon hit 2.33 

306598 AI000320 EST singleton (not in UniGene) with exon hit 2.33 

300307 AI651018 Hs.246311 ESTs 2.33 

321348 Z49979 EST cluster (not in UniGene) 2.33 

325112 AI903770 Hs.124344 ESTs 2.32 

336679 CH22 FGENES43-7 2.32 

321383 AJ002574 EST cluster (not in UniGene) 2.32 

337357 CH22_FGENES.730-6 2.31 

300680 AW468066 Hs.257712 ESTs; Weakly similar to KIAA0986 protein [H.sapiens] 2.31 

327120 CH.21_hs gi|6531970 2.31 

302761 AW250553 EST cluster (not in UniGene) with exon hit 2.3 

312132 AW75490 Hs.170577 ESTs 2.3 

315639 AA827652 EST cluster (not in UniGene) 2.3 
312189 T95594 Hs.187435 ESTs 2.3 

306537 AA991705 EST singleton (not in UniGene) with exon hit 2.3 

327061 CH.21_hs gi|6531965 2.3 

315391 AA759098 Hs.192007 ESTs 2.3 

322384 AI968646 Hs.33862 ESTs 2.29 

323206 AA203339 Hs.220750 ESTs 2.29 

318110 AI680915 Hs.201379 ESTs 2.28 

335250 CH22.FGENES.516J1 2.28 

331696 Z36907 Hs.91662 KIAA0888 protein 2.28 

318327 AW294013 Hs.200942 ESTs 2.28 

324980 AA969121 Hs.254298 ESTs 2.28 

319429 AI608881 Hs.11482 ESTs; Highly similar to junctional adhesion molecule [H.sapiens] 2.28 

310601 AI970543 Hs.192605 ESTs 2.28 

318905 Z43395 EST cluster (not in UniGene) 2.28 

323442 AA252753 Hs.164039 ESTs 2.27 

304428 AA342250 Hs.99819 ubiquitin specific protease 16 2.27 

313352 AW292127 Hs.144758 ESTs 2.27 

316491 AA766025 Hs.238794 EST 2.27 

317751 AI697668 Hs.202241 ESTs 2.26 

314136 AA229781 Hs.221962 ESTs 2.26 

306665 AI004614 Hs.130577 EST 2.26 

303946 AW474196 Hs.221604 ESTs 2.25 

313435 AA769123 EST cluster (not in UniGene) 2.25 

317679 AA968799 Hs.150289 ESTs 2.25 

322370 AA330095 EST cluster (not in UniGene) 2.25 

306620 AI000929 EST singleton (not in UniGene) with exon hit 2.24 

329109 CH.X_hs gi|5868626 2.24 

311043 AI871209 Hs.177128 ESTs 2.24 

300228 AI458372 Hs.158748 ESTs; Weakly similar to synapsin Ib [M.musculus] 2.24 

307223 AI193698 Hs.184776 ribosomal protein L23a 2.24 

309023 AI888045 EST singleton (not in UniGene) with exon hit 2.23 

310749 AI493675 Hs.170332 ESTs 2.23 

316769 AI914939 Hs.212184 ESTs 2.22 

320409 AA356195 EST cluster (not in UniGene) 2.21 

333149 CH22_FGENES.87_8 2.21 

324951 M86125 Hs.137487 ESTs 2.21 

321939 AI791617 Hs.145068 ESTs 2.2 

320594 AI863952 Hs.169436 arginyltransferase 1 2.2 

320722 R67430 Hs.172787 ESTs 2.2 

321781 D78667 EST cluster (not in UniGene) 2.2 

328903 CH.08_hs gi|5868514 2.2 

303889 T19204 EST cluster (not in UniGene) with exon hit 2.2 

325045 T08845 EST cluster (not in UniGene) 2.2 

312828 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 2.19 

335109 CH22JK3ENES.494J5 2.18 

330878 AA131471 Hs.71440 ESTs 2.18 

311289 AI971362 Hs.231945 ESTs 2.18 

304608 AA513456 EST singleton (not in UniGene) with exon hit 2.18 

337393 CH22LFGENESJ47-4 2.18 

332812 CH22JFGENES.7J4 2.18 

327665 CH.04_hs gi|5867839 2.18 

314581 AW504859 Hs.237849 ESTs 2.17 

326508 CH.19_hs gi|6682496 2.17 

301242 AW161535 Hs.258803 ESTs 2.17 

312780 AI765651 Hs.172900 ESTs 2.17 

315954 AW276810 Hs.254859 ESTs 2.16 

311179 AI880843 Hs.223333 ESTs 2.16 

315320 AI084162 Hs.186895 ESTs 2.16 

313017 AI015203 Hs.118015 ESTs 2.16 

312430 AW139117 Hs.117494 ESTs 2.15 

300864 AA406539 Hs.190958 ESTs 2.15 

314753 AA463262 EST cluster (not in UniGene) 2.15 

322574 AF156548 EST cluster (not in UniGene) 2.15 

321409 C03864 EST cluster (not in UniGene) 2.15 

321205 AA002047 EST cluster (not in UniGene) 2.14 

320406 AA353895 Hs.152933 HUS1 (S. pombe) checkpoint homolog 2.14 

337646 CH22_EM:AC000097.GENSCAN.11-2 2.13 

303084 AF174008 EST cluster (not in UniGene) with exon hit 2.13 

312185 AA654772 Hs.186564 ESTs 2.13 
306813 AI066544 EST singleton (not in UniGene) with exon hit 2.13 

314465 AA602917 Hs.156974 ESTs 2.12 

316168 AI821782 Hs.220587 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 

315990 AI800041 Hs.190555 ESTs 2.11 

320712 R66887 EST cluster (not in UniGene) 2.11 

318487 AI167877 Hs.143716 ESTs 2.11 

317462 AW015206 Hs.178784 ESTs 2.11 

304384 AA235482 Hs.62954 ferritin; heavy polypeptide 1 2.11 

314544 AA399018 Hs.250835 ESTs 2.1 

319881 T72744 EST cluster (not in UniGene) 2.1 

328078 CH.06_hs gi|5868008 2.1 

317354 AW090770 Hs.192271 ESTs 2.1 

308617 AI738720 EST singleton (not in UniGene) with exon hit 2.09 

311568 AW439969 Hs.218177 ESTs 2.09 

313605 AI761788 Hs.204674 ESTs 2.09 

314289 AA848118 Hs.221216 ESTs 2.08 

332933 CH22_FGENES.38_7 2.08 

325498 CH.12_hs gi|5866967 2.08 

313659 AW296067 Hs.124106 ESTs 2.08 

324596 AW149321 Hs.105411 ESTs 2.08 

324783 AA640770 EST cluster (not in UniGene) 2.07 

302696 AA347452 EST cluster (not in UniGene) with exon hit 2.07 

313418 AW450674 Hs.114696 ESTs 2.06 

326920 CH.21_hs gi|6456782 2.06 

327574 CH.03_hs gi|5867818 2.06 

323207 AI052795 Hs.192201 ESTs 2.06 

303753 AW503733 Hs.170315 ESTs 2.05 

305235 AA670480 EST singleton (not in UniGene) with exon hit 2.05 

316055 AA693880 EST cluster (not in UniGene) 2.05 

317194 AW445167 Hs.126036 ESTs 2.05 

319565 AW408683 Hs.32922 ESTs 2.05 

335146 CH22_FGENES.499_2 2.05 

301475 AI678183 Hs.170917 prostaglandin E receptor 3 (subtype EP3) 2.04 

312442 AA120970 Hs.143199 ESTs 2.04 

322502 R62925 Hs.243665 ESTs 2.04 

303693 AA290875 Hs.30120 ESTs 2.04 

310179 AI215643 Hs.171381 ESTs 2.03 

321121 W23285 EST cluster (not in UniGene) 2.03 

331330 AA282197 Hs.89002 ESTs; Highly similar to CGI-07 protein [H.sapiens] 2.03 

306557 AA994530 EST singleton (not in UniGene) with exon hit 2.03 

317865 AI298794 Hs.129130 ESTs 2.03 

318667 AI493742 Hs.165210 ESTs 2.02 

318042 AW294522 Hs.149991 ESTs 2.02 

323818 AW245528 Hs.134754 ESTs 2.02 

331286 AA137062 Hs.103853 ESTs 2.01 

311262 AI989942 Hs.232150 ESTs 2.01 

335601 CH22_FGENES.581_41 2.01 

311351 AI682303 Hs^01274 ESTs 2.01 

312996 AA249018 EST cluster (not in UniGene) 2.01 

328190 CH.06_hs gi|5868077 2 

338030 CH22_EM:AC005500.GENSCAN.148-16 2 

333940 CH22.FGENES.301J5 2 

328227 CH.06_hs gi|5868105 2 

331481 N27448 Hs.43944 EST 2 

335288 CH22_FGENES.527_1 2 

307513 AI274307 EST singleton (not in UniGene) with exon hit 2 

323316 AL134620 EST cluster (not in UniGene) 2 

319479 R21945 Hs.256153 ESTs 2 

303482 AA502583 Hs.197271 ESTs 2 

327489 CH.02_hs gi|6004459 1.99 

323935 AW175841 Hs.192183 ESTs 1.99 

309575 AW168096 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.99 

337043 CH22_FGENES.439-19 1.98 

312897 AI828174 Hs.227049 ESTs 1.98 

307681 AI370434 EST singleton (not in UniGene) with exon hit 1.98 

328656 CH.07_hs gi|6004473 1.98 

314569 AA813784 Hs.123001 ESTs 1.98 

332783 W45302 Hs.87889 helicase-moi 1.98 

315259 AA701499 Hs.148115 ESTs 1.98 
313171 N67879 
318060 AI241421 
332256 N66393 
312110 A1962180 



W00545 

AA868267 

H15474 

AA862973 

AI373163 

AW090537 

AW028820 

AI820675 

AW373446 



320389 
314065 
323086 
323919 
310750 
309435 
300129 
320130 
323787 
338112 
313625 
325240 
331833 
332252 



Hs.157695 ESTs 
Hs.132236 ESTs 
Hs.102754 ESTs 
Hs.226803 ESTs 

CH22_FGENES.629J 
Hs.171785 ESTs 
Hs.85524 ESTs 

Hs.12214 Homo sapiens clone 23716 mRNA sequence 
Hs.220704 ESTs 
Hs.170333 ESTS 

EST singleton (not in UniGene) with exon hit 
EST duster (not In UniGene) with exon hit 
ESTs 



1.97 
1.97 
1.97 
1.97 
1.97 
1.97 
1.98 
1.96 
1.96 
1.96 
1.96 
1.96 
1.95 



Hs.203804 

Hs.169885 ESTs; Weakly similar to cONA EST EMBLT02216 comes from this gene [C.etegans] 1.95 



AW468402 

AA412102 
N63682 



Hs.254020 
Hs.250911 



300279 AW237425 Hs253817 



Hs.198800 
Hs.113011 



321609 H86021 
324183 AA402453 
336276 
334913 
325417 

318489 AW043590 
318455 AI148763 
306890 AI092235 
315073 AW452948 
321289 884687 
308521 AI689808 
306382 AA968967 
331320 AA262999 
324279 AA501412 
309577 AW168753 
327014 

303488 AWO25860 
306561 AA995223 
330694 AA019806 
313083 M50545 
327752 

318674 AA295490 
301267 AW297762 
332092 AA608787 
323509 AL036947 
321452 AA317554 
311483 AI765013 
300976 AI246374 
323715 AA322155 
313800 AW296132 
332029 AA489697 
304013 AW518573 
322019 AA354549 
334150 

310094 AW450967 
316218 AW207642 
324774 A1Q31771 
326507 

314570 AA405696 
336268 

315278 AI9B5544 
325824 

316277 AA737780 
323181 AA418583 
301438 AA961643 
307050 A1147341 
306830 A1075803 



HS225023 



Hs.257631 
Hs.226306 



Hs.42788 
H&191688 



Hs.129559 
Hs.108447 
Hs.159200 



HS255690 
HS.1 12590 



Hs.209128 
Hs.165861 

Hs.166674 
Hs.145053 
Hs.156110 
Hs.41181 



CH22_EM^C005500.GENSCAN.185-24 
ESTs 

CH.10.hsgi|5866848 
interieukin 13 receptor; alpha 1 

za21f9.s1 Scares total liver spleen 1NFLS Homo sapiens cONA clone 

IMAGE293225 3\ mRNA sequence 

ESTs 

CH.17jisgil5B67245 

ESTs; Weakly slmnar to hMmTRAlb [H^apiens] 
ESTs 

CH2£J=GENES.762_5 
CH22_FGENES.456._3 
CH.12tJis gi|5866925 
ESTs 

EST cluster (not In UniGene) 

EST singleton (not in UniGene) with exon hit 

ESTs 

ESTs 

EST singleton (not in UniGene) with exon hit 
EST singleton (not in UniGene) with exon hit 
ESTs 

ESTs; WeaWy similar to Pro-PoMUTPase polyprotein [M.musculus] 
EST singleton (not in UniGene) with exon hit 
CR21Jisgl[5867664 
EST cluster (not in UniGene) with exon hit 
EST 



ESTs 
CH.G5JIS #15867949 
EST duster (not in UniGene) 
ESTs 
ESTs 

EST duster (not in UniGene) 
EST duster (not in UniGene) 
ESTs 
ESTs 

EST duster (not In UniGene) 

ESTs 

ESTs 

Immunoglobulin kappa variable 1 D-8 

Homo sapiens mRNA; cDNA DKF2p727C191 (from done DKFZp727C191) 
CH22_FGENES.339J 



Hs.235240 ESTs 
Hs.174021 ESTs 
Hs.132586 ESTs 

CH.18_hsgi)5867435 

EST cluster (not in UniGene) 

CH22_FGENES.758__2 
H$. 116429 ESTs 

CH.15Jisgi|5867048 
Hs.213392 ESTs 
Hs.143621 ESTs 
Hs.127716 ESTs 
Hs.146734 EST 

EST singleton (not In UniGene) with exon hit 
1.95 
1.95 
1.95 
1.95 

1.95 

1.95 

1.95 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.94 

1.93 

1.93 

1.93 

1.93 

1.93 

1.93 

1.93 

1.92 

1.92 

1.92 

1.92 

1.92 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.91 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.9 

1.89 

1.89 

1.89 






302420 AL049925 Hs.225984 DKFZP547G0910 protein 

320127 H72615 Hs.17268 ESTs 

337736 CH22_EM:AC000097.GENSCAN.100-2 

331319 AA262755 Hs.194264 ESTs 

310767 AI377505 Hs.158835 ESTs 

314880 AI732169 Hs.105429 ESTs 

312639 AI004377 Hs.200360 ESTs 

309674 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H. 

314621 AI627478 Hs.187670 ESTs 

319495 AI972146 Hs.192756 ESTs 

313472 AA007374 EST cluster (not in UniGene) 

302705 U09060 EST cluster (not in UniGene) with exon hit 

329511 CH.10_p2 gi|3933514 

317140 AI699412 Hs.201925 ESTs 

302598 AI815985 Hs.129683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 

332222 N28271 Hs.176618 ESTs 

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 

318470 AI159863 Hs.143713 ESTs 

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 

300370 AI827817 EST cluster (not in UniGene) with exon hit 

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 

325587 CH.12_hs gi|6682462 

310237 AI884313 Hs.158906 ESTs 

318872 R13085 EST cluster (not in UniGene) 

303431 AA317915 EST cluster (not in UniGene) with exon hit 

338427 CH22_EM:AC005500.GENSCAN.349-1 

300452 AI352293 Hs.191098 ESTs 

321279 H85330 Hs.146060 ESTs 

301690 F05865 Hs.249180 ubiquitin-conjugating enzyme E2E 2 (homologous to yeast UBC4/5) 

307932 AJ230822 EST singleton (not in UniGene) with exon hit 

318292 AI679966 Hs.150603 ESTs 

310254 AI239811 Hs.157491 ESTs 

311790 AW016437 Hs.233462 ESTs 

314248 AA278347 Hs.126078 ESTs 

335586 CH22_FGENES.581J5 

339209 CH22.FF1 13D1 1 .GENSCAN.64 

307954 AI419692 EST singleton (not in UniGene) with exon hit 

302549 AF055136 Hs.248162 tectorin alpha 

321829 H87213 Hs.158092 ESTs 

301239 AA807558 EST cluster (not in UniGene) with exon hit 

332434 N75542 Hs.75356 transcription factor 4 

327192 CH.01_hs gi|5887445 

310214 AI220072 Hs.165893 ESTs 

320516 R33857 Hs.161479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 

324231 W60827 EST cluster (not in UniGene) 

336616 CH22_FGENES.613J5 

328799 CH.07_hs gi|5868316 

324661 AW504161 EST cluster (not in UniGene) 

313190 AA766707 Hs.153039 ESTs 

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 

320187 T99949 EST cluster (not in UniGene) 

320791 R78808 Hs.33961 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 

305733 AA829535 Hs.84298 CD74 antigen (invariant polypeptide of MHC; class II antigen-associated) 

308280 AI569349 Hs.180920 ribosomal protein S9 

321533 W78877 Hs.40111 ESTs 

312946 AI915122 Hs.204087 ESTs; Weakly similar to F33011.9b [C.elegans] 

319474 H90265 Hs.100638 ESTs 

329519 CH.10j)2gp83510 

324685 AA220982 EST cluster (not in UniGene) 

320697 N62937 Hs.139181 ESTs 

329246 CH.X_hs gi|5868732 

332000 AA481271 Hs.193945 ESTs 

310811 AI420990 Hs.161303 ESTs 

325866 CH.16_hs gi|5867076 

322064 Z78343 EST cluster (not in UniGene) 

333712 CH22.FGENES.251J 
313457 AA576052 
321591 H85687 



AI656320 

AA081924 

AI27S011 

H20560 

AI3411B0 

R17531 

AA730673 

AI400310 

AW292760 

AA649011 
AI623739 
AI248285 
D81015 



NM_000565 

AI475949 

AW205705 

Z43011 
AA845630 

H54178 

H20826 

AA333666 

AI264671 

A1540166 

AI663782 

AL038841 

AI286182 

AW451733 

AA001811 



311080 
329522 
322869 
300175 
330976 
300208 
319635 
313454 
303093 
309815 
326506 
319845 
300290 
312180 
313058 
330120 
328412 
302345 
308100 
311386 
330282 
318856 
312486 
325450 
321206 
330977 
303487 
310398 
313230 
317747 
303381 
336123 
300185 
316002 
319850 
329941 



322934 
325902 

303530 
300930 
331909 
321553 
301618 
319592 
318511 
327183 
313516 
318644 
321632 
324657 
300437 
319775 
314775 
337460 
309849 
301471 
312739 
319995 
326495 
337497 
322633 
332177 
326930 
316893 



A!493054 

W01813 

AI274851 

A1025527 

AA437300 

K92449 

T52760 

AA627356 

T26528 

AA029058 

A1752482 

AA419617 

AW451142 

AW449374 

AA504429 

AI149880 

AW297444 
AA995014 
A1318426 
H15355 



AA004534 
F10812 

AA837332 



Hs.193223 ESTs 
Hs.1 17927 ESTs 

CH.05_p2gi|6671884 
Hs.197711 ESTs 

CR10_p2gi|3983507 
Hs.211417 ESTs 
Hs.204877 ESTs 
Hs.244624 ESTs 

Hs.196115 EST$WeaklyslmBarloFIBRlLLIN1 PRECURSOR [H^aplens] 

EST duster (not In UniGene) 
Hs.188634 ESTs 
Hs.148958 ESTs 

EST singleton (not In UniGene) with exon hit 

CH.19_hsgl|5867435 
Hs.187902 ESTs 
Hs.186387 ESTs 
Hs.1 18348 ESTs 
HS.125382 ESTs 

CH.19j)2gi|6671B64 

CH.07Jisgi|5668405 

EST duster (not in UniGene) with exon hit 

EST singleton (not in UniGene) with exon hit 
Hs.207514 ESTs 

CH.05j>2gi|6671910 
Hs.21169 ESTs 
Hs.117904 ESTs 

CH.1£_hsgi|5B66941 
HS226469 ESTs 
HS.31783 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.164166 ESTs 
Hs.129563 ESTs 
Hs.128245 ESTs 

Hs.163313 ESTs; Weakly similar to ilil ALU SUBFAMILY SB WARNING ENTRY !!!! 

CH2e.FGENES.70L8 
Hs.208484 ESTs 
Hs.119824 ESTs 
Hs.83722 ESTs 

CH.16_p2gi|6165189 

CH.07Jisgi|5868375 
Hs.158968 ESTs 

CH.16JwtgiI5867101 
Hs.12109 WD40 protein Ciaol 
HS558744 ESTs 
HS222097 ESTs 
Hs.178210 ESTs 
Hs.1 16406 ESTs 

EST duster (not in UniGene) with exon hit 
Hs.163315 ESTs 

Hs.227175 ESTs; Weakly similar to !ii! ALU SUBFAMILY SQ WARNING ENTRY Ml 

CH.01.hs #867442 
Hs.135145 ESTs 

EST duster (not in UniGene) 

EST duster (not in UniGene) 
Hs.255628 ESTs 
Hs.257149 ESTs 

Hs.621 1 methyl-CpG binding domain protein 1 
Hs.188809 ESTs 

CH22./GENES780-5 

EST singleton (not in UniGene) with exon hit 
Hs.129544 ESTs; Weakly simitar to ORF YLL027w [S.cerevisiae] 
Hs.155925 ESTs 
Hs.60887 ESTs 

CH.19JS #867423 

CH22LFGENES.801-4 
Hs.153981 ESTs 
Hs.101433 ESTs 

CH21_hsgi[6456782 

EST duster (not in UniGene) 
324826 AA704806 Hs.143842 ESTs 1.75 

311269 AI656924 Hs.174257 ESTs 1.75 

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75 

314171 AI821895 Hs.193461 ESTs 1.75 

311684 AI990741 Hs.252809 ESTs 1.75 

334387 CH22L.FGENES.380J 1.75 

312195 AI300101 Hs.252222 ESTs 1.75 

315707 AI418G55 Hs.161160 ESTs 1.74 

324349 AW501470 EST cluster (not in UniGene) 1.74 

300724 AI762929 Hs.206134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74 

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74 

303714 AW501336 EST cluster (not in UniGene) with exon hit 1.74 

318704 Z24981 EST cluster (not in UniGene) 1.74 

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74 

322601 W92924 EST cluster (not in UniGene) 1.74 

319382 H93199 Hs.33665 ESTs 1.74 

315858 AA737345 EST cluster (not in UniGene) 1.74 

332243 N55484 Hs.220540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74 

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74 

324044 AL045752 Hs.211519 ESTs 1.73 

320630 AA199847 EST cluster (not in UniGene) 1.73 

327288 CH.01_hs gi|5867481 1.73 

314986 AI201367 Hs.142860 ESTs 1.73 

319078 H17255 Hs.144515 ESTs 1.73 

326278 CH.17_hs gi|5867269 1.73 

302552 H49792 EST cluster (not in UniGene) with exon hit 1.73 

322322 AF086431 EST cluster (not in UniGene) 1.73 

327075 CH.21_hs gi|6531965 1.73 

317392 AI797588 Hs.145459 ESTs 1.73 

300610 AI076890 Hs.186949 ESTs 1.73 

315978 AA830893 Hs.119769 ESTs 1.73 

323903 AA773580 Hs.193598 ESTs 1.73 

330803 AA004699 Hs.150580 putative translation initiation factor 1.73 

309845 AW296802 Hs.255580 EST 1.73 

314963 AI689617 Hs.200934 ESTs 1.73 

311710 F09774 Hs.175971 ESTs 1.73 

315315 AI984592 Hs.15088 ESTs 1.73 

300378 AA663560 Hs.235873 ESTs; Weakly similar to K11C4.2 [C.elegans] 1.73 

316141 AW303457 EST cluster (not in UniGene) 1.72 

319826 T71739 Hs.75442 albumin 1.72 

312961 AI033922 Hs.122517 ESTs 1.72 

334379 CH22.FGENES.379J1 1.72 

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72 

313031 N34927 Hs.186566 ESTs 1.72 

329728 CH.14_p2 gi|6065785 1.72 

312090 N57692 Hs.118064 ESTs 1.72 

323341 AL134875 Hs.192386 ESTs 1.72 

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71 

310766 AI971438 Hs.158824 ESTs 1.71 

311450 AI809985 Hs.203340 ESTs 1.71 

311792 AW238064 Hs.253909 ESTs 1.71 

321500 H71999 EST cluster (not in UniGene) 1.71 

311948 T78791 Hs.241569 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71 

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71 

329089 CH.X_hs gi|5868614 1.71 

322331 AF086487 EST cluster (not in UniGene) 1.71 

318235 AI080361 Hs.1$4217 ESTs 1.71 

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71 

312681 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71 

310250 AI478629 Hs.158465 ESTs 1.71 

338178 CH22_EM^C005500.GENSCANi!19^ 1.71 

338910 CH22.DJ32I10.GENSCAN.11-2 1.71 

321225 AL080073 Hs.251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7 

322289 AA534550 Hs.539 ribosomal protein S29 1.7 

319802 AI701489 Hs.202501 ESTs 1.7 

314022 AW452420 Hs.248678 ESTs 1.7 

314937 AA515602 Hs.152330 ESTs 1.7 
313421 
309763 
322092 
315603 
325031 
327157 
314309 
320361 
324721 
328624 
303344 
328960 
315702 
302385 



AA761322 
AA262785 
AW339515 
AW270182 
AF085833 
AA764768 
T08597 

AI741461 

H87220 

AW402302 



Hs.220538 
Hs.163700 

Hs.121158 



Hs.161904 
Hs.146406 
Hs.43616 



AA255977 Hs.250646 



309506 
330417 
315296 



AA657501 

W224172 

R14537 

AW137700 

D84424 

AA876905 

AA354146 
AL079289 
AI927068 
A1472124 
AI273815 

AA195405 
R05385 
Z42977 
AW244073 
AW137772 

AL080280 
T58960 
AA249037 
AA424754 
AI797592 
AA081820 



Hs.146315 
Hs.204096 



ESTs 

EST singleton (not in UniGene) with exon hit 
ESTs 

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

ESTs 

EST cluster (not in UniGene) 

CH.01_hs gi|5866841 

ESTs 

nitrilase 1 

ESTs 

CH.07_hs gi|5868246 

ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 
CH.08_hs gi|6458775 
ESTs 



320303 
302887 
310695 
307512 
338506 
331722 
301431 
318853 
323032 
317538 
325780 
321739 
319808 
313443 
331366 
316443 
322878 
330320 
329081 
334026 
317791 

331148 
325452 
315106 
326014 
307130 
300943 
319402 
310889 



323371 AL135118 
335568 

320654 AW263086 



EST cluster (not in UniGene) 
EST singleton (not in UniGene) with exon hit 
hyaluronan synthase 1 
ESTs 

CH.07_hs gi|5868485 
EST cluster (not in UniGene) 

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971 
ESTs; Weakly similar to R10D12.12 [C.elegans] 
ESTs 
keratin 8 

CH22_EM:AC005500.GENSCAN.390-10 
Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 
EST cluster (not in UniGene) with exon hit 
ESTs 



Hs.57697 
Hs.125286 



Hs.137154 
Hs.110853 
Hs.157757 
Hs.242463 



330002 
315343 
334487 
312169 
309668 
309518 
307965 
316787 
300835 
338763 
303327 
313231 



Hs.21062 
Hs.145946 ESTs 
Hs.185980 ESTs 

CH.14_hs gi|6381953 
EST cluster (not in UniGene) 
EST cluster (not in UniGene) 
EST cluster (not in UniGene) 
Hs.43149 ESTs 
Hs.207407 ESTs 

EST cluster (not in UniGene) 
CH.08_p2 gi|5932415 
CHJLhsgi|5868602 
CH22_FGENES.318_3 
AI801500 Hs.128457 ESTs 
AF086106 EST cluster (not in UniGene) 

R73816 Hs.17385 ESTs 

CH.iaJw #866941 
AW452184 Hs.232100 ESTs 

CH.16_hs gi|5887160 
AI185234 EST singleton (not in UniGene) with exon hit 

AA524545 Hs.224630 ESTs 
W21298 EST cluster (not in UniGene) 

AI457946 Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic 
nucleotide-gated channel 2 [H.sapiens] 
EST cluster (not in UniGene) 
CH22LFGENES.581J 

Hs.118112 ESTs 

CH22J)A59H18.GENSCAN.3-1 
CH.16_p2 gi|6623963 
AW205477 Hs.179891 ESTs 

CH22_FGENES.395_9 
AI064824 Hs.193385 ESTs 
AW204480 Hs.253414 EST 
AW148928 Hs.248895 EST 

AI421641 EST singleton (not in UniGene) with exon hit 

AW369770 Hs.130351 ESTs 
AA401858 Hs.224843 ESTs 

CH22_EM:AC005500.GENSCAN.517-16 

AA232729 Hs.154302 ESTs 
AW139993 Hs.163682 ESTs 
334073 

319801 T77138 
326530 

301126 AI802877 
314043 AA827082 
304387 AA236027 
322932 AA099732 
337272 

332694 AA262768 
318996 Z44266 
315336 AW342028 
313329 AW293704 
318088 AW295409 
313835 AI538438 
320035 AA376974 
309372 AW074330 
324157 AW402236 
323929 AA354940 
302490 AA885502 
333942 
327469 

301918 AA476777 
315664 AI744068 
304405 AA282572 
310624 AI341594 
319250 F11623 
310608 AJ962234 
317348 AI348076 
306513 AA989230 
320807 AA086110 
303710 AI269069 
328291 

304236 W93278 
317683 AI791700 
311960 AW440133 
312834 AI028309 
325326 

313663 AI953261 
327526 

300429 AW449679 
305169 AA663131 
316621 AI021996 



AI744130 

AL031709 

AI307229 

AA498019 

AI183686 

N49476 

R87650 



315763 AW515270 
323571 AA984133 
312240 R28628 
304569 AA490934 
313179 AI076101 



318035 
300492 
316532 
332048 
307113 
319127 
331155 



CH2£_FGENES.327J8 
Hs.8765 RNA helicase-related protein 

CH.19_hs gi|5867441 
Hs.210843 ESTs; Weakly similar to dJ1039K5^[H.sapiens] 

EST cluster (not in UniGene) 

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

CH22_FGENES.660-1 
Hs.243901 KIAA1067 protein 

EST cluster (not in UniGene) 
Hs.256112 ESTs 
Hs.122658 ESTs 
Hs.137945 ESTs 
Hs.159087 ESTs 



317276 
312572 
311932 
302103 
308413 
310077 
337780 
327766 
308352 
324539 
303232 
337884 



AI823847 

AA350125 

AW451654 

AA452310 

AI636253 

AI620617 



AI610791 
AI378032 
AA437414 



1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.65 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 
1.64 



Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64 
EST singleton (not in UniGene) with exon hit 1.63 

EST cluster (not in UniGene) 1.63 
Hs.145958 ESTs 1.63 
Hs.187032 ESTs 1.63 
CH22_FGENES.301_8 1.63 
CH.02_hs gi|5867772 1.63 
EST cluster (not in UniGene) with exon hit 1.63 

Hs.160712 ESTs 1.63 
EST singleton (not in UniGene) with exon hit 1.63 

Hs.157522 ESTs; Moderately similar to env protein [H.sapiens] 1.63 

EST cluster (not in UniGene) 1.63 

Hs.196102 ESTs 1.63 
Hs.831 34iydroxymethyl^ethytglutar^ 1.63 

EST singleton (not in UniGene) with exon hit 

Hs.188536 Homo sapiens clone 24838 mRNA sequence 
Hs.250852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 
CH.07_hs gi|5868363 

EST singleton (not in UniGene) with exon hit 
Hs.127893 ESTs 
Hs.189690 ESTs 
Hs.114246 ESTs 

CH.11_hs gi|5866875 
Hs.169813 ESTs 

CKOajis g]]6381832 
Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens] 

EST singleton (not in UniGene) with exon hit 
Hs.122138 ESTs 

CH.14_p2 gi|6272129 
Hs.131201 ESTs 

multiple UniGene matches 
Hs.184304 ESTs 
Hs^01591 ESTs 

EST singleton (not in UniGene) with exon hit 
EST cluster (not in UniGene) 
Hs.33439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61 
CH22_EM:AC005500.GENSCAN.246-9 1.61 



1.63 
1.63 
1.63 
1.63 
1.63 
1.63 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 
1.62 



Hs.118342 ESTs 
Hs.153260 c-Cbl-interacting protein 
Hs.203669 ESTs 

EST singleton (not in UniGene) with exon hit 
Hs.131704 ESTs 

CH.20_hs gi|6552462 
Hs.129986 ESTs 
Hs.187499 ESTs 
Hs.257482 ESTs 

Hs.26090 ESTs; Weakly similar to T20B12.1 [C.elegans] 
Hs.196511 EST 
Hs.148565 ESTs 

CH22_EM:AC000097.GENSCAN.121-2 

CH.05_hs gi|5867982 

EST singleton (not in UniGene) with exon hit 
Hs.125892 ESTs 

EST cluster (not in UniGene) with exon hit 
CH22_EM:AC005500.GENSCAN.54-2 
303620 AA397546 Hs.119151 ESTs 1.61 

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61 

314481 AA548589 Hs.105846 ESTs 1.61 

300327 AI903894 Hs.245893 ESTs 1.6 

323473 AA262442 EST cluster (not in UniGene) 1.6 

326154 CH.17_hs gi|5887170 1.6 

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6 

323827 AW406878 EST cluster (not in UniGene) 1.6 

322452 W56710 EST cluster (not in UniGene) 1.6 

310597 AI739071 Hs.158515 ESTs 1.6 

307871 AI368665 EST singleton (not in UniGene) with exon hit 1.6 

322215 AF088005 EST cluster (not in UniGene) 1.6 

318420 AI139857 Hs.143837 ESTs 1.6 

332217 H98987 Hs.102383 EST 1.6 

324937 M79230 Hs.192398 ESTs 1.6 

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6 

300674 AW467388 EST cluster (not in UniGene) with exon hit 1.6 

315193 AI241331 Hs.131765 ESTs 1.6 

319713 R24204 EST cluster (not in UniGene) 1.6 

301210 AE3799B2 Hs.158944 ESTs 1.6 

309365 AW072861 EST singleton (not in UniGene) with exon hit 1.6 

321403 AW451454 Hs£47568 adenylate kinase 3 1.6 

321908 AA376936 Hs.20998 ESTs 1.6 

303349 AA382661 EST cluster (not in UniGene) with exon hit 1.6 

324338 AL138357 Hs.247514 ESTs 1.6 

310599 AW300144 EST cluster (not in UniGene) 1.6 

333193 CH22_FGENES.98J5 1.6 

336433 CH22_FGENES.B25_.12 1.6 

312097 AI352096 Hs.157169 ESTs 1.6 

311445 AW204237 Hs.192703 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.59 

317736 AI361722 Hs.192410 ESTs 1.59 

308147 AI498991 EST singleton (not in UniGene) with exon hit 1.59 

313489 AA017492 Hs.135655 ESTs 1.59 

316289 AA902488 Hs.122952 ESTs 1.59 

326983 CH.21_hs gi|6867657 1.59 

314781 AW205298 Hs.202372 ESTs 1.59 

328397 CH.07_hs gi|5868397 1.59 

331970 AA461084 Hs.187677 ESTs 1.59 

321744 N91419 Hs.12028 ESTs 1.59 

310509 AI292181 Hs.150036 ESTs 1.59 

315921 AI147545 Hs.114172 ESTs 1.59 

322049 AI928242 Hs.144383 ESTs 1.59 

301161 AA731518 EST cluster (not in UniGene) with exon hit 1.59 

300548 AI026836 Hs.114689 ESTs 1.59 

319142 F07366 EST cluster (not in UniGene) 1.59 

313526 AW152263 Hs.249243 ESTs 1.59 

305937 AA883238 EST singleton (not in UniGene) with exon hit 1.58 

330123 CH.19_p2 gi|6671869 1.58 

327819 CH.05_hs gi|5867968 1.58 

318250 AI478814 Hs.134603 ESTs 1.58 

306760 AI034094 Hs.169476 tubulin; alpha; ubiquitous 1.58 

322358 AA220235 Hs.246838 ESTs 1.58 

317866 AI690269 Hs.201345 ESTs 1.58 

320725 AA703318 Hs.120967 ESTs 1.58 

311332 AW292247 Hs.255052 ESTs 1.58 

334893 CH22_FGENES.452_7 1.58 

318730 AA398215 EST cluster (not in UniGene) 1.58 

315889 AW271639 Hs.221744 ESTs 1.58 

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDa subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57 

315086 AI492660 Hs.170935 ESTs 1.57 

332514 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57 

335549 CH22J=GENES576J0 1.57 

329532 CH.10_p2 gi|3983505 1.57 

323140 AA180467 EST cluster (not in UniGene) 1.57 

313166 AI801098 Hs.151500 ESTs 1.57 

337896 CH22_EM:AC005500.GENSCAN.56-3 1.57 

330658 AA319514 Hs.211093 ESTs 1.57 

324585 AI823969 Hs.132678 ESTs 1.57 
300785 


AA682913 


304921 


AA603092 


324605 


AW502851 


324473 


AW501163 


300566 


H86709 


314165 


AA761265 


302668 


AA157392 


314034 


AI299137 


325369 




331849 


A A j 4 ^ma 

AA417078 


320536 


AA331732 


303347 


AA258033 


315769 


AA744875 


317031 


AA973297 


300203 


AI827065 


304037 


T26438 


322613 


AW160507 


317987 


AW138174 


322313 


AF086386 


323992 


AW411383 


325303 




312701 


AJ 457663 


304767 


AA582678 


305849 


AA861571 


314557 


AA401367 


316507 


AI381515 


315023 


AA533505 


314920 


AA513406 


323097 


Z44354 


325043 


W27919 


307892 


AI376088 


324573 


AA491600 


313092 


AI923673 


324696 


AA641092 


303019 


AF098363 


317158 


AI459140 


309536 


AW151933 


301568 


AI146423 



HsJ>55735 ESTs 
Hs.203231 EST 

CH.19_hs gi|5867307 
Hs54888 ESTs 

EST cluster (not in UniGene) 

EST singleton (not in UniGene) with exon hit 

CH22_EM:AC005500.GENSCAN.174-1 
Hs.195602 ESTs 
Hs.170480 ESTs 

CH.07_hs #004473 
Hs.148559 ESTs 

CH.21_hs gi|6531965 
Hs.204079 ESTs 

EST cluster (not in UniGene) 
Hs.204579 ESTs 
Hs.127301 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.199297 Homo sapiens GNAS1 gene encoding NESP55 

EST cluster (not in UniGene) 
Hs.133132 ESTs 

EST cluster (not in UniGene) with exon hit 

EST cluster (not in UniGene) 

CH.07_hsgi|58683B8 

CH.Y_hsgi|5888874 
Hs.257767 EST 

CH22^EM:AC005500.GENSCAN£28-1 
Hs.127320 ESTs; Weakly similar to KIAA0346 [H.sapiens] 

EST cluster (not in UniGene) 

CH22.FGENES.369J7 
Hs.233374 ESTs 
Hs.162017 EST 

Hs.247179 ESTs; Weakly similar to KIAA0319 [H.sapiens] 

EST singleton (not in UniGene) with exon hit 
Hs.249978 ESTs 

EST cluster (not in UniGene) 
Hs.21371 son of sevenless (Drosophila) homolog 1 
Hs.221281 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.154214 ESTs 

CH.12_hs gi|5866921 
Hs.193767 ESTs 
Hs.137224 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.189413 ESTs 
Hs.126101 ESTs 
Hs.224877 ESTs 

EST singleton (not in UniGene) with exon hit 

EST cluster (not in UniGene) 
Hs.130651 ESTs 

EST cluster (not in UniGene) 
Hs.169688 ESTs 

CH.11_hs gi|5866908 
Hs.128127 ESTs 

EST singleton (not in UniGene) with exon hit 

EST singleton (not in UniGene) with exon hit 
Hs.128647 ESTs 
Hs.158381 ESTs 
Hs.185844 ESTs 
Hs.152307 ESTs 

Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 
Hs.32944 inositol polyphosphate-4-phosphatase; type I; 107kD 
Hs.158759 EST 
Hs.161942 ESTs 
Hs.212827 ESTs 
Hs.257339 ESTs 

EST cluster (not in UniGene) with exon hit 
Hs.129109 ESTs 

EST singleton (not in UniGene) with exon hit 
Hs.146709 ESTs 
315674 AA651923 Hs.191850 ESTs 1.53 

321861 N79341 EST cluster (not in UniGene) 1.53 

310890 AI184510 Hs.143728 ESTs 1.53 

330038 CH.17_p2 gi|6042048 1.53 

316907 AA843868 Hs.190567 ESTs 1.53 

312299 AA972712 Hs.174818 ESTs 1.53 

331128 R51361 Hs.23423 ESTs 1.53 

305177 AA663591 EST singleton (not in UniGene) with exon hit 1.53 

337685 CH22_EM:AC000097.GENSCAN.77-1 1.53 

335290 CH22_FGENES.527_3 1.53 

308896 AI858667 EST singleton (not in UniGene) with exon hit 1.53 

307944 AI418246 EST singleton (not in UniGene) with exon hit 1.53 

300867 AW340374 Hs.121033 neural precursor cell expressed; developmentally down-regulated 1 1.53 

335320 CH22_FGENES.534_7 1.53 

329841 CH.14_p2 gi|6672062 1.53 

317916 AI565071 Hs.159983 ESTs 1.53 

332901 CH22_FGENES.36_2 1.53 

305413 AA724659 EST singleton (not in UniGene) with exon hit 1.53 

316707 AI016387 Hs.184406 ESTs 1.53 

313693 AW469180 Hs.170651 ESTs 1.53 

316101 AA922236 Hs.221037 ESTs 1.53 

320796 AF038966 Hs.184543 secretory carrier membrane protein 1 1.53 

307451 AI248815 EST singleton (not in UniGene) with exon hit 1.53 

323648 AI679968 Hs.152060 ESTs 1.53 

331482 N27515 Hs.40296 ESTs 1.53 

318059 AI023175 Hs.167022 ESTs 1.53 

325958 CH.16_hs gi|5867142 1.53 

315736 AA664265 Hs.230213 ESTs 1.53 

314740 AW015667 Hs.119427 ESTs 1.52 

314117 AA224368 Hs.185164 ESTs 1.52 

301646 AA313954 EST cluster (not in UniGene) with exon hit 1.52 

338752 CH22_EM:AC005500.GENSCAN.513-10 1.52 

309314 AW009312 EST singleton (not in UniGene) with exon hit 1.52 

301445 AI208364 Hs.128233 ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens] 1.52 

308501 AI685263 Hs.201150 EST 1.52 

312330 AA635305 Hs.121574 ESTs 1.52 

318040 AJ018150 Hs.148781 ESTs 1.52 

336205 CH2£_FGENESJ19J0 1.52 

325701 CH.14_hs gi|5867028 1.52 

315009 AW189460 Hs.208358 ESTs 1.52 
303121 AW407585 Hs.27769 ESTs; Weakly similar to mCAC [M.musculus] 1.52 
309271 AI986221 EST singleton (not in UniGene) with exon hit 1.52 
328385 CH.07_hs gi|5868395 1.52 
307700 AI318545 EST singleton (not in UniGene) with exon hit 1.52 
314591 AW103292 Hs.245328 ESTs 1.52 
304484 AA432067 Hs.258373 ESTs 1.52 
304382 AA232873 EST singleton (not in UniGene) with exon hit 1.52 
304232 W52674 EST singleton (not in UniGene) with exon hit 1.52 
309853 AW288169 Hs.57553 tousled-like kinase 2 1.52 
312504 AW2G7346 Hs.143202 ESTs 1.52 
313134 N63406 Hs.258697 ESTs 1.52 
330391 AF015950 Hs.115256 telomerase reverse transcriptase 1.52 
314342 AI873046 Hs.25877S ESTs 1.51 
305977 AA887293 EST singleton (not in UniGene) with exon hit 1.51 
301165 N85789 Hs.224155 ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens] 1.51 

300613 AI932294 Hs.249604 ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens] 1.51 

324124 AI554212 Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 1.51 

308037 AI458207 Hs.174181 ESTs 1.51 

323909 AL043148 Hs.186257 ESTs 1.51 

315464 AW139500 Hs.116135 ESTs 1.51 

306700 AI022056 EST singleton (not in UniGene) with exon hit 1.51 

337976 CH22_EM:AC005500.GENSCAN.107-1 1.51 

306855 AI083982 EST singleton (not in UniGene) with exon hit 1.51 

311045 AI569399 Hs.174746 ESTs 1.51 

315010 AA531082 Hs£40049 ESTs 1.51 
310205 AW025248 Hs.202445 ESTs 1.51 
310759 AW135924 Hs.224883 ESTs 1.51 
310954 AW44S044 Hs.171298 ESTs 1.51 

312019 T77046 Hs.188760 ESTs 1.51 

334773 CH22_FGENES.430_5 1.51 

332043 AA490831 Hs.125056 ESTs 1.51 

322850 AA296219 EST cluster (not in UniGene) 1.51 

337920 CH22_EM:AC005500.GENSCAN.67-3 1.51 

328993 CH.09_hs gi|5868536 1.51 

309245 AI972447 EST singleton (not in UniGene) with exon hit 1.51 

312172 AI222168 Hs.191168 ESTs 1.51 

304039 T47349 EST singleton (not in UniGene) with exon hit 1.5 

301329 AI149653 Hs.190498 ESTs 1.5 

313376 AI949246 Hs.200381 ESTs 1.5 

324248 AW504918 EST cluster (not in UniGene) 1.5 

308771 AI809301 EST singleton (not in UniGene) with exon hit 1.5 

334935 CH22_FGENES.464_3 1.5 

319764 AA019827 EST cluster (not in UniGene) 1.5 

318519 T27135 EST cluster (not in UniGene) 1.5 

332807 CH22_FGENES.7_9 1.5 

322310 AF085376 EST cluster (not in UniGene) 1.5 

324557 AA489166 Hs.156933 ESTs 1.5 

332118 AA609585 Hs.162689 EST 1.5 

319539 R09027 EST cluster (not in UniGene) 1.5 

313149 AW291092 Hs.201058 ESTs 1.5 

329722 CH.14_p2 gi|6065785 1.5 

323514 AA861209 EST cluster (not in UniGene) 1.5 

308078 AI472621 EST singleton (not in UniGene) with exon hit 1.5 

337965 CH22_EM:AC005500.GENSCAN.100-10 1.5 

335905 CH22_FGENES.535_13 1.5 
TABLE 14A shows the accession numbers for those primekeys in Table 14 that lack UniGene IDs. 
For each probeset, we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
GenBank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The GenBank 
accession numbers for the sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: GenBank accession numbers 
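
The three-column layout described above (Pkey, CAT number, Accession) can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how rows in this layout might be parsed. It assumes clean, whitespace-separated rows in which the Pkey is a six-digit number and the CAT number has the form digits, a "." or "_" separator, and digits; the names `ClusterRow`, `parse_table_14a`, and `index_by_pkey` are hypothetical. The sample rows are copied from entries of Table 14A, with one row wrapped artificially to exercise the continuation handling.

```python
import re
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of the table: probeset key, cluster number, member accessions."""
    pkey: int
    cat_number: str
    accessions: List[str]


# Assumed row shape: six-digit Pkey, CAT number such as "159551.1" or "792518_1",
# then one or more accession numbers. This pattern is an assumption for the sketch.
ROW_START = re.compile(r"^(\d{6})\s+(\d+[._]\d+)\s+(.*)$")


def parse_table_14a(lines: Iterable[str]) -> List[ClusterRow]:
    """Parse rows; a line that does not start a new row continues the previous one."""
    rows: List[ClusterRow] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank spacer lines between rows carry no data
        match = ROW_START.match(line)
        if match:
            rows.append(
                ClusterRow(int(match.group(1)), match.group(2), match.group(3).split())
            )
        elif rows:
            # Wrapped accession list: append to the most recent row.
            rows[-1].accessions.extend(line.split())
    return rows


def index_by_pkey(rows: Iterable[ClusterRow]) -> Dict[int, ClusterRow]:
    """Build a lookup from Pkey to its row."""
    return {row.pkey: row for row in rows}


SAMPLE = [
    "300129 635249.1 AW028820 AI219068",
    "",
    "321861 1651920.1 N79341 N99082 N47551",
    "",
    "323140 159551.1 AA180467 AA449184",
    "AA464831 AA505048",  # artificial wrap of the same row
]

rows = parse_table_14a(SAMPLE)
lookup = index_by_pkey(rows)
print(lookup[323140].accessions)
```

Rows whose CAT number was damaged in extraction would not match the assumed pattern and would be folded into the preceding row, so a stricter reader might reject such lines instead of appending them.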



Pkey CAT number Accession 



322064 234514 1 BE261397 278343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA3740B7 AA584776 
321409 197898 1 N71838 AA282003 T54072 AA761419 H92966 Al 831 371 A1095435 AI690247 R99331 AW9541 10 AA975590 AA346128 
H94196 C03864 

322092 46678.1 AF085833 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339 

321452 212379J AW962489 H64300 AA329527 

313603 199797 1 AA284333 AW4681 19 AA284334 AA810992 

320856 36098 1 AB040928 T94673 AI289313 A1536039 Z44366 BE141499 D60116 D61488 D59945 AA419503 R28090 R72986 K03255 

AI1891 12 A1912312 AW51 1018 A(401349 AW470144 C14624 AI335797 240300 AI014456 D60269 D601 15 T16722 AI370673 
D60270 

322139 46806J H53744 AF075088 H53797 

321500 552826J BE004271 AI248023 AJ022157 H71999 

313733 441212J AA766346 AA809877 AA836116 AW469598 AW977404 

322215 47002.1 AF088005 N51816 N51731 

322235 47070.1 AF086106 AI193589 AW665594 N71795 AA722627 AW665373 AI300251 

321632 286374 1 AW812795 AM19617 H87827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393 

313833 120893 1 AA766825 AA81 1 180 AA085906 AI762946 AW977820 

322310 47376J AF086376 W77804 W72689AA837735 

322313 47386J AF086386 W77947 W72708 

322322 47434 1 AFD86431 AA886756A1557237 

322331 47467 1 AF086467 W81444 W81445 

322345 47537 1 W95298 AF086529 AJ912190 AW294159 AI458747 W94782 

322347 47545 1 AF086538 W95969 AI63191 1 W95835 

322370 187612 1 AA330095 W251 12 AA249401 

321739 43998 T AL08Q280 T73124 H02689 AL080281 

321781 1511778 1 078667 D78871 C18258 

314570 280469J AA904776 AA405696 AA405962 

300129 635249.1 AW028820 AI219068 

322452 497108J AI147202 W56755 W56710 

321861 1651920.1 N79341 N99082 N47551 

323140 159551.1 AA180467 AA449184 AA464831 AA505048 

322520 38916 1 T55958 T57205 AF147346 

321914 85114 1 AA011603 N58604 N58811 

322571 22297 1 NMJH6102 AF156271 AA781888 AW152318 AW770403 AA909463 AM82996 AA758672 

322574 39412L1 AF156548 AA639797A1675267AJ825497AI823355 

314753 311451 1 AA463262 AA463615 AW160405 AW407583 

300370 3910J AW136181 AA581939 AK001221 AA694538AA424043AI016272 AA098960AA884473AI356180BE391633AA437086 

AI277866 AA098827 AA992680 BE172624 AM24101 AA320776 AW962987 N77431 AW858960 AW858897 T85649 
AA357743 AI827817 AI905672 

322601 577912 1 A1082395 W92924 BE048524 AW0053Q2 AI084474 AI369330 AI827710 AW135506 AW298694 

322613 34330 T AW160507 NM.013367 AF191338 AA384939 AI445790 AA730309 BE397003 BE267753 AI979163 N50386 AW583671 

AW583608 BE074466 BE074479 BE074471 AW976283 AA604393 AW162122 W73648 AI823475 N75888 W73713 
AW470099 AW513236 AW025055 AW613115 AI923379 W58081 AW664525 AW196785 AI143619 AI565152 AA025406 
AA505846 AI685494 AA829964 N59156 N59163 R15442 AA826919 AI610221 A1200120 AA603279 AW150822 AI189513 
AI807122 AI016368 A1335868 AW583389 AI193892 AI956157 AI628879 AW591589 AW583446 AI955406 AW148396 
AI340255 AI867942 AA748525 AA876991 Z38516 AI8740Q2 AI869474 N63100 AA429094 AA082443 

316055 409389 1 AW105663 AA693880 AW517398 Ai768507 BE220851 AW978538 AA831489 

323316 981458*1 BE219300 BE327455 AL134620 R36741 R17996 

300492 25768 J AL031709 AI249061 AA907658 AW20444 
316141 423880J AW303457 AA972713 AA724265 

323371 117336J? N45114 N51465 BE087338 AI083551 AL135118 BE395609 

307700 30923J1 BE260938 BE254670 8E294951 BE564979 AW4Q5364 AA069258 AA1 29837 AI559667BE281405AW410850BE041153 

A1254811 AW301340M813335AW301411 A1609469 A1811607 AI611816 AI377623 AI335509 A1G13544 BE0431 65 AI371663 
AI340452 AI812066 AW072890 A1254558 AI349884 AI370095 AI613383 AI61 1946 A1613353 AI307414 AI318229 A! 5 12685 
AW305327 AW268924 A1370063 AI349292 BE049068 AI369098 AW274098 AI344845 AW075187 AI053401 AI345220 
BE138515 A1613386 A1583302 AW301955 AJ349661 AI307432 AI054168 A1223913 AI612081 AI348942 AI334539 AI309366 
AI370098 AI252360 AW086316 AW26891 1 AW073482 AI379802 At224284 AI053681 AI334538 A1309369 A1309888 AI310023 
AI492709 AI335418 AI053999 AI366989 AW073478 AI247058 AI249584 AI305875 AI308585 AW071272 AI271487 AI340719 
AI366995 AI223673 AW271066 AE611G38 AW071296 AI270796 AI254385 AI251393 AI252562 AW268236 AI254858 
AW071317 AI309102 AI609897 AW268971 AI583267 AI792484 AW075168 BE138443 A1254126 A1309822 AI310872 
AI61 1953 AI251054 AW276658 AI335405 AW075Q39 AI311768 AI612028 AW271895 AI612005 A1312240 AW271082 
AI371642 AI334879 AI3101 94 AI310772 AI345419 AI334675 AI223914 AI284707 AI284813 AI349140 AI254853 AI313094 
AJ310170 AI309499 A1312476 A1376484 AI335467 AI340802 A1309815 AI310168 AI61 1448 A1345824 BE327775 AI318545 
F17185AW614950 

3083S2 792518_1 AW998989 AI613519 

307783 697809 J A1347274 AW844024 

301161 427238 J AA731518 M765714 

324094 270098 1 BE395109 AW663898 AW237041 AI492154 BE046906 AI651285 A1983290 AW002590 AI201040 F32424 AA992272 
AW271836 

309023 4737 1 AF 180681 NM_015313 AA229509 AA225792 AA216413 AI838045 BE005205 AB002380 T55518 BE276097 AW380669 

BE142836 AW370976 AA479384 R96425 AI680999 AA595138 H54582 AI022709 T55440 AJ 04 1769 AA861 144 AW392026 
AA479287 AA824634 AI638448 H54691 R96382 AA770352 A1640467 AW293491 AA779138 R28298 AA970562 C15590 
R84455 AA020769 AL036394 H60566 BE548861 AA301207 AW959414 AI284253 AA043173 W52429 BE544571 R24852 
Z42603 F1 3120 R24340 R24326 T75305 H701 10 N56255 AA334210 F1 1 453 AW947285 H80345 AA298992 AW380931 
AI267175 Z45421 AW380981 W88113 AA663590AA1 67577 BE566760BE1 69166 AA449904 AA459205 N31 126 W03564 
N31208 AW993277 N44765 AW605275 D61449 W68572 AA258190 D80496 AW992964 U46277 H04097 AA370360 
AW957211 AA159775 AI631243 H83367 H21671 D61077 AW392712 N21 1 12 H98522 N45298 N83629 AI393509 AW022043 
AA744886 A1580482 AA723286 AI422244 A1423984 D62804 A1088349 AA587890 AI144172 N33275 BE074397 H03399 
D62578 AIQ56639 A1829918 AA579584 AI089460 AI350124 W68573 AI530823 H98897 AI570468 H83715 W861 14 AA923123 
D57446 AA043174 AW337721 A1266551 A1140017 AW022356 D79855 079650 D79393 D60495 AA788666 AA693443 
AW516977 W60139 AJ628156 AW473223 AI608892 AA159670 AW440366 AI421529 T50751 A1174374 AA912234 AA724248 
AW780400 AA907218 H80514 D57452 AA863419 AA552618 D29614 R44556 T16452 R44935 Z41 132 D29188 H69692 
AI250176 A1078860 AA370359 AW183108 H74200 AA258183 F10723 C00323 R86148 AA860570 AW130073 AL078946 
AA410327 AA532614 AA234500 Al 151507 AA410288 AW969639 AA483232 AI383200 AA236540 AI607672 H73441 

323473 193878.1 AA262442 AA768862 AA262443 

315639 392767 1 AA827650 AA827652 AW629526 BE044585 AW974451 AA761439 AA648505 AA765803 
322878 117013J AA081820 AA082191 AA078811 
301239 457668 1 AA807558 AA8271 17 AW629567 

301256 16720 1 NM_0 16603 AF251038 Al 124624 AA776579 AW298470 A1304868 AW082724 A1348442 BE218336 N20641 Al 01 8013 

AW858832 AW978157 AA815187 AA932948 AF157316 AI444958 W00848 W02935 AI434833 N26335 AA428681 AW371059 
AI651612 AW134937 AW98891 1 AA488815 AL157523 W48766 AW936954 AW936941 AW579205 AW936866 AW936889 
N74541 AW936953 AW578421 AW604352 AW367088 AW849258 AW849453 AW371 606 AI554921 W49785 H99814 
AA805957 AA904606 AW206696 BE169229 AA333951 AA190704 AW936944 AA463219 AA430306 AW805704 N48503 
BE222307 AI638612 BE550045 AI805304 AI690987 AA776841 H 1 2690 AW1 83731 AI380760 A1636261 AA812641 
AW592656 Al 68 6 132 AA843424 K99220 AW084996 AW128879 AJ800871 AA610135 AA191524 AJ150076 AI474530 
AA748461 N29013 AA746372 N59606 

N75450 AA877636 AW1 37945 W05248 AA514763 AW972399 AI758397 AW195051 
AW402931 BE393099 
AL036947 T93676TB5475 

AA641735 AA281881 AA861209 AA934756 AA835887 AA641795 AA748822 AW295703 
AW467388AA826954 

AF16871 1 AA099732 BE019157 AI38Q212 BE298159 AA249097 AA3051 12 AW962349 AW962353 AW401801 BE292961 
AI439469 AA442919 AI630537 AA724473 AI814288 AW966815 AI376871 A1860202 AI683132 AA099733 AW627633 
AI754022 BE206347 AW183349 AI378222 8E178926 A1473282 W52944 AW752469 AW966817 
AA301270 AA301379 AA301366 

R85652 AA1 14024 AA298219 AA375304 AW983796 AW885952 AW020969 AA1 14025 AI804930 BE350971 AI765355 
AW317067 AW974763 H85930 AW172600 AI310231 AW612019 D62908 D62864 AA652738 AI674617 AI494064 AW138666 
AI147620 AM47629 AW61 1793 AI668922 AI971005 AI864742 AA174171 

AK001701 AA134337 AA356202 8E163251 AW875175 AW875181 AW875177 BE163389 AKQ00741 AA247755 AA120819 
AW868040 AA3091 18 AW962348 AA471267 AW996843 AK001452 BE005344 BE617899 AA186588 M120820 AW36331 1 
AA648105 N71529 BE168417 AW673900 AI858160 AA134338 AA659697 N22162 AI335437 AJ311237 AI343171 AI336661 
AW268074 AW274348 AA935005 AW576295 AW262628 AW593153 AA730055 AA662650 AA782687 AW894855 AI933533 
AW193002AW899448 AW880142 AW812670AA085664 AA334191 BE178085 BE1 80553 AA389680 AA984772 AA442527 
W26560 BE384359 AA847210 A W3 049 31 A1669606 AA085613 AW 197240 AJ632828 AA581648 AW129348 AJ 01 7643 
AW089030 D20893 AI382955 AI557148 AW499979 
324231 975669 J W60827 AL079968 AL047234 
324248 977901J AW504918 N55410 AL1 18584 AW839266 

323691 221757J AA317561 AI783000 AW235111 AI793178 AA767397 A1263113AA719462 
300611 337193_1 
324157 247225J1 
323509 987739J 
323514 197787J 
300674 466093J 
322932 39838J 



323591 209807J 
322950 10774J 



322957 29014J 
315858 406384 J 
301431 569736.1 
324303 233842.1 
324330 300543J 
300815 41537.2 
324349 1154015.1 
323715 225129J 
309314 23273.-3 
323758 229624.1 
309375 127.1 



325031 266373.2 
325045 1534945.1 
324473 38785.1 
323827 235506.1 
302270 1734192.1 
301618 10987.5 
301646 42154J 



323923 249295 J 

324580 328264.1 

316774 463723.1 

309577 6483.6 

302345 29533.1 

302358 1064753.1 

324614 215437J 

324661 385257.1 

324685 41003.1 



324692 351987.1 
316893 473541.1 
303027 21796.1 

324715 290035.2 

324771 385085J 

324783 389615.1 

303114 37417.1 

303124 21112.1 



302552 82290.1 
301918 316229.1 
303232 20474.1 



302696 33570.1 

302697 43219.1 
309917 57485.2 
303347 192210J 
303349 193138.1 
310599 690880.1 



AA737345 AA682288 AF789378 

R05385AI061251 

AL1 18754 AA333202 H38001 

AA884766 AW974271 AA592975 AA447312 

BE152396 BE152395 AA287515 BE001834 AA266678 AW406477 

AW501470AW502931 AW499500 

AA322155 AA326396 AA326538 

AW009312 

AA833858 AW978090 AA327679 AA810438 

AF286598 AW075342 ABQ28994 AL043713 AW378914 AA340650 N571 66 AW956914 R17961 AA336481 BE393734 
AW977867 AW294638 AA927857 AA961627 AW303969 AW894416 AA8121 19 AA912758 AA424355 AA490582 W30941 
AA476693 AA131029 AA127777 AL043714 AA496984 T51 1 17 AA127722 AA594012 AI492876 N76483 AW1 19061 BE464926 
AW303419 AI972370 AI768172 AW26550 AI435432 AI379516 AA778421 AI276089 AA424521 N59361 AA723153 AA723176 
AI867487 AA090677 A1827221 AB51027 W02732 AI810729 AA142848 AI0821 10 N59379 N29744 AI283747 Ah 48665 
AW779845 AI382967 F34319 AI369934 AI282438 AW183449 AA863467 AA813469 AI092645 AJ870701 AA8631 19 
T65475 R07576 T17017 F08143 Z43546 
T08845Z43538 F06691 

BE560824 BE513941 AW238907 AA580852 AW501 176 BE241846 AW501 163 AW751433 AW501340 BE241715 AI910774 
AW406878 AW966560 AW966151 AW966496 AA336174 AA335376 AA335537 
R56151 W91936 
T52761 T52760 

AJ277641 AI630669 AI804370Z41939 AW751251 AA299456 Z44739 AW860471 Zg)158AW1 05391 H56997W84688 
AA491201 W84636 AA706815 AI131055 AA483636 AI005075 AW340034 AI332372 AW1 18195 AI338932 AJ 191 958 
AA693932 AI169982 AI193225 AA884163 AA594562 W37747 AA249754 AA746131 AI916540 AI832188 AW946555 
AA833838 Z40564 AA861563 F01447 AA887937 AI933559 AW973250 AA566018 AA313954 
AA354146 AI164230 AA643525 
AA492588 AA492498 AA492571 
AA814859 AA814857 AI582623 
AW902251 AW168753 

X12830 NNL000565 AW503691 X58298 S72848 AA193347 AW503481 AW177946 AW178192 AW178188 AA285233 
AA410577 AA193465 AW177939 AW365459 BE221693 
AW207734 D60164 081 150 D81078 061356 AW996804 
AW503101 AA309184 N56323R70998 
AW504161 AW503601 AW505509 

AF226667 AA207032 AA1008O4 AA121287 AA488316 AI608218 AW419048 A191 1097 AW132123 AA502311 AW089948 
AA100952 AI075431 AW083432 AI990554 BE466Q29 F28643 AF086422 W79581 AW439007 F37179 W79780 AW439035 
AA731381 AW750380 AA25101 2 AW589846 AA730238 AA329792 AW087255 AA220982 AA082469 AA877260 AA232380 
BE298910 

AA557952 AA677593 AA618150 
AW979189 AA837332 AAB56946 AA876935 

AF1 1 1 178 NMJ005708 AF105287 AW590040 AI979280 AA001 322 BE146329 AA702430 AA702429 AA694221 AI206348 
AI206285 AW770197 AA923032 AI379586 AA701165 AW594643 AA001909 AW002368 

AI739168 AA426249 AI199636 AW505198 AW977291 AA824583 AA883419 AA724079 AI015524 AI377728 AW293682 
A1928140 AA731438 AI092404 AI085630 AA731340 
AA631739 AA768584 AW134477 
AA640770 AI6831 12 AA913009 
AF090948 A1064898 AI11 1 182 

AB018257 BE148640 AA081832 AK001915 AF150217 AF161350 A1219174 AW074664 D60040 AA346065 H28750 
AW151783 BE613360 BE612628 BE502031 AW183790 AA992580 AA505815 A1310432 A1678015 AW592679 AA879181 
AA806708 AI7441 10 H24681 C16064 062900 AI285033 AA346064 AI865123 AW467798 BE221231 AL120676 N89877 
A1928370 AI358387 AA748486 AV647478 AV647460 AA312313 A1279340 AW505099 
AA005122 H49792 
AA476777 T86049 

AA437414 AA131479 AA086182 AB037775 AW161063 AW514393 AA332331 AW136197 BE1 50789 AA425533 AA249605 
N88308 AI016201 BE004662 AA291027 R57587 AA424277 AA476391 W07532 T97036 AA218898 AW162629 R57770 
W01278 W90204 W90156 AL119197 R84513 AA2801O3 AA334994 AW965504 AA460868 AA447470 AW138594 W38898 
W90028 AI078353 W90078 AA699696 N35523 AA704225 AA035059 AW134892 AA1 15140 AJ 142854 H90084 AA826342 
AA460694 N46339 AA425344 N56953 AA035569 AI781083 AI658696 AI52481B AI338965 AW069249 AW299871 BE464061 
AI189720 AW340682 AI423380 AI275122 H17532 N80735 AA826343 A1039694 BE328398 All 92947 AW271286 AI623122 
AI922902 AW293087 N22141 AA730657 AW316610N26473 F06663Z43810H14783R59761 H11540AI265915 AI681773 
AI091748 BE220636 AW841861 AI702181 A1468447 AA907544 A1273941 AW244034 R37769 AA446663 T96929 BE0458B4 
AA476341 H89994 H29043 AW051211 N49522 AA306977 

AK000738 AA347452 AW981713 H70832 AT750643 AA362887 AW955588 W44974 AA279599 AW298762 AA452666 

AA443355 AI337273 AA446931 AI752977 AA661554 W42674 A1292172 R41163 AA621381 AI244157 

AJ001409AJ001410 

AW340014 AW866993 AV651649 

AA258033AA459485 

AA382661 AW958642 AA259088 

AW300144 AI338491 AI796381 BE220076 
303388 869232 1 AL039604 AL039497 

302761 45074.T AW250553 L07876 236843 R30693 Al 180097 AW965317 
318455 606341 1 AI14B763 A1903763 AI903753 AJ903762 A1903800A1903801 

317850 363835 1 AI681545 M951714 AI570397AW873588 AA836396 AI359986 AW99790AA773477AI951615T07547AW304709AF114041 

BE176629 Z44580 T30422 T32690 AW953065 H10602 
303431 32082.1 NM_000539 AA019013 AA019367 AA056154 H38735 AA057003 AA021051 H38102 AA015774 AA059291 AA019439 H84843 

H83375 AA019914 AA01728B R84449 W26519 H38258 AA018736 H84147 AA018577 AA059353 U49742 H38767 AA318341 
AA317553 H86648 H91989 AA317398 AA317378 W29024 W23034 T27877 AW950O59 AA017185 R84262 AA057177 
H89941 AA019904 H84662 AA015775 AA019368 AA020976 H37900 C20733 H38682 H65197 AA018578 AA017252 
AA019440 AA059059 H38651 H84148 AA018560 W25754C20752 AA317915 AW952115 AA317369 AA019845 R85402 
AA019492 AA017196 AA056093 AA056094 AA058836 AA056155 W25957 W23027 AA056159 W23043 W21890 W28951 
AA317978W28459 AA3 17265 
N49476Z45911 R21061 
AA331906AA332484 

AK001952 AA336839 AW249271 BE247287 AF182002 BE613472 AW962673 AA332235 AW849937 AW849814 H49893 
AA477148 AW968944 AF182003AW007897BE246145 W76100AI480141 AW410205AA609339AI209111 AW000979 
AA330280 AW961554 W72865 H49894 AA514317 AA620407 AA504522 AW472833 AA716609 AW129282 AA347351 
AA628378 AW5S9860 AI636696 AA464632 AA464533 AW874189 AA757076 AA479654 AW517910 AW292357 AW872638 
AW262288 Ai910668 AW513749 AW238771 AA215797 BE387073 

BE143533 AW850432 AK000042 AA333666 AA385314 AW966616 AW793068 AW793414 AA361 103 AW390841 AA040095 
AW385058 AW789162 A13831 15 A1990745 AI653703 BE503693 AW150758 A] 94991 9 AW190450 AW512348 AI625970 
AW501057 N52954 AI281 378 AI4O1710 AI648409 AW002659 AI687639 AI093943 R33960 AA040062 AI926267 AI240425 
AI520911AKJ93428 R52943 

303488 36085 1 AI040372 AB040915 W40569 BE158910 BE158914 D63226 AW025860 AW583088 AA334307 AA210942 AW753212 

AW805322 AA362635 BE15891 1 AW891225 AW994882 AA805451 R28541 AA229347 N48266 AI377788 R28682 R36122 
AA811941 AI240742AI632001 T99965 W01976 AW891205 AW891177 T97433 C15571 AA346850 AA504293 W07500 
A1694503 AA489216 AA327725 AW959917 AA694146 N685U AI076265 AW016246 T077B3 AA642400 AA716133 AA805332 
R00312 AA705021 AW498605 AW891723 AW891906 AA808025 N29039 N74897 W60393 AA810184 AI627460 AW057516 
AA807436 AA760968 AI359295 N78642 N20662 AA83O300 W81705 AA832258 AW891718 AI811786 AW515523 Z41735 
AA449978 AW891714 AI684539 AW891B98 AW071701 AJ890916 AI924994 AI039743 AA888524 AA244214 AI015736 
AI270105AI865077 

F30712 F35665 AW26388B AI904014 AB04018 AA336927 AA336502 
H08370 Z46168 F07366 AA193168 AA193138 

AK000290 AI476034 AA465309 BE148761 AW303607 AW958665 AW469635 AI819365 AI243857 AW469326 AA157110 
AA278626 AA496257 AA306656 F29732 AA831859 AA312210 AA564476 AA579065 AA769522 AA740388 AI205635 
AA491643 AA810400 AA417708 AI567332 AA157392 N53817 AA374229 
R68545 T271 19 R25687 AW750672 
H13364 T27135 R61679 AA746905 
H77679 

AB038995 NM_016530 AK0O1 111 AA465635 AW968716 U66624 AA885459 AA703019 A1040266 AI018689 AI692886 
AI125372 A1376796 AI192040 N58161 AL 133607 AW503673 AW505479 AA362265 AJ404671 
F11623H17552AA347728 

BE311816 AK000916 AW868037 AW868039 AF228527 AI752482 AW868041 AA077049 AI201537 W55873 AA206019 
AA077918 AW968729 AI97B828 AW139620 A1093053 AW204025 AI418805 AA598926 AA586345 AA045669 BE314455 
AA045668 

W01 166 AW998900 BE184300 Z44887 T34535 R51495 AW886575 AA295490 AA295162 AA2951 63 AW937125 T56951 
BE386106 W52674 

AW500106 BE241915 AW503971 NM.016542 AB040057 AA313812 AK000556 W16504 A1822088 AA259107 AA191319 
BE085957 AA309584 BE122687 AW952435 T84469 BE038194 BE088132 AA328562 BE092674 AA2631Q2 T39634 
AW992380 R79391 R24392 H03060 AW675066 A1299952 AW02O325 D25953 N75199 AA361425 AW612302 AW236333 
AW673897 AW953686 N22323 AA649168 A1377099 H03061 AI660072 AW276405 AA809779 AI303430 AW297484 
AW510384 AA814816 AA371522 063035 AA953567 R79392 R24282 AA876831 AW297542 AI699023 AA992652 AI041436 
At631602 AW589676 Z28684 Z24981 
Z32887 BE349923 AA398215 AA399231 
AW501336AW501337 
AA236027 BE003275 

AA185509 BE394661 AV660757 AA489161 BE165972 AW503705 AA262785 AF123320 Z78357 NM.014171 AF161488 
AA248971 BE568575 AA461410 AA165108 AIB37731 H75454 AA372934 AW339334 BE568754 BE564697 BE567299 
AI681606 BE537269 AW197204 AA290890 AI189393 AW292463 AW470227 F27399 AW61 1942 BE566888 AW301701 
A1675761 A1628429 AA164711 AI797753 A1656878 AI91 2690 A1675277 AI695099 AI094095AW014158BE091059AI201 748 
AW236961 AI038003 AI083606 AA401608 AI079405 AI073516 AI655537 AA401475 A1814532 AI079862 AI093789 A1422084 
A121 6476 AJ392760 AA926998 AA781782 Z25198 A1086377 AI185511 Al 185539 Z28843 A1223792 A1379563 AA706253 
AI433788AI921885 H75455AW025269AE224100AI083611 AI225057 AW1 96334 AI572254 AA761628 AJ472801 AA283784 
303751 468554J AA830149 AW978407 M85983 AW5Q3637 

319401 1323199J W00973 N56457 AW992226 T84921 R01342 

319402 1003489J R86913 R68901 H25352 R01370 H43764 AW044451 W21298 
318807 1536467J F08434Z42573 H28810 

319478 765461J AI524124 R06841 R06842 
318872 1534581J Z43108 F06295 R13085 



303494 238389J 
319142 164820.1 
302668 12593.1 



318518 1205335.1 

318519 434741.1 
304168 72494 -10 
302948 21445.1 

319250 244351.1 

318644 17700.1 



318674 204968.1 
304232 20640 J 
303685 8088.1 



318704 799152J 
318730 275116.1 
303714 1155758.1 
304387 183612J 
304398 10169.1 






318885 94880,2 
303841 79133J 






303889 
319539 
318905 
320187 
316998 
319635 
319699 
319713 
319761 
319764 
319808 
321040 
320409 



1777183.1 

63198J 

1536408J 

396254J 

65715J 

163534 J 

74719BJ 

1699356 1 

75324.2 

88596.1 

7069.3 

193331J 

43709.1 



319881 
320488 
321121 
321205 
321253 
314043 
320630 
313435 
313443 
313472 
321348 
314138 
320712 
321383 



1585983.1 

368456.1 

1545647.1 

81249J 

375160.1 

155125.1 

17685.2 

443527.1 

82292.1 

8281 1J 

41762.1 

179960.1 

57156^2 

41924.1 



312996 187327.1 

306513 

306537 

306557 

306598 

306620 

306700 

308078 

306813 

306830 

306855 

329722 d4_p2 

329728 C14_J>2 

306890 

308100 

308147 

306929 

308352 

308383 

308521 

308561 

308617 

308771 

308828 

308898 

303019 41850J 

303084 4421 1_1 

305092 AA642912 

305169 

305177 

305235 

305413 



AA742999 Z43272 AA345258 AW956677 AA031942 

W19657 BE616760 BE259848 BE382680 BE615587 AI934464 AA322745 T07155 AW961 174 AA307302 Z41888 AA621992 

AA188400 AW770608 A! 147458 Al 148408 AI69S291 AA972591 

T19204T36109T36107 

R09Q27 AA344892 AA329574 AW955648 AW978708 AI567804 AI378935 AW014657 AI804134 R08922 N92947 BE546788 

F08365 Z43395 R54298 

T99949 AA654769 AA664550 AW975264 

Z44268 H06384AV655948 

R17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 K05030 AI142105 R12654 

AI458682 H24240 R14537 R18426 AW867082 

R24204R15712T64695 

AW630974 BE005208 R84237 AA724997 AA334867 AW955777 R18816 

AA019827R18947H46852 

T58960 AA609180 AA621 130 AI927236 AA431075 

AA261830 AW967855 H26953 AA262478 

AA226869 AA296516 AW959753 AA186390 AL359619 AA356195 AA148427 R22748 AI033624 BE548853 H95327 
AW579751 BE561649 AA397533 BE617136 AA236444 T89946 AA247450 N55777 W38725 AI743846 AI808406 AA922229 
AI051464 W04713 R1 1251 W19656 A 1 0423 19 AA489276 AI224533 H95274 AW26995S T8931 1 AI890088 AI362754 
AI830968 AI669336 AI589780 AA534557 AW273839 AI338155 AI126632 N83542 BE046048 AA807028 AA848107 
AW167978 AA976930 AA148428 AI289304 A1524262 AI625961 AA773469 AI222288 A1280054AI242371 AA227222 
AA973329 AA296517AA829436 AA234526 AJ149769 AI567865 AA936939 AI590681 AW469308 AI689531 AA486419 
AI422051 AHJ57252 AA626941 AI475352 AW247913 At222370 AA670122 AW198034 AA486418 AI363794 AA330739 
H51299 H44619 H46391 R86024 H51892 T72744 
AI817336 R32883 AA595590 AI743065 R31386 
W23285 H42714 F25381 F37215 
AA002047 N72537 K54142 H81580 
AA610649A1699484 H59558 
AA827082 AA732246 AA167611 AA830741 

AA199847 AA410224 R53323 AW836567 AW936569 AW936568 AW936571 

AA769123 AAB31715 AW977666 W92553 

AA005125 W95019 W93335 AA249037 

AA007374 AA007468 AI816886 

Z49979D61703U30168 

AA74C616 AA654854 AA229923 

R66867 R65678 R82673 W7312B R83101 

AW968556 AJ238555 AW968731 AJ002574 AA459446 H70260 AW977557 AA767351 AW268572 AA810719 A1698677 

AI300460 AA907450 AA649224 T07415 AI536896 BE018515 AI279865 BE047421 

AW368634 AI702169 AI245179 AW368646 BE545574 AA249018 AW368633 N27553 

AA989230 

AA991705 

AA994530 

AI000320 

AI000929 

AI022056 

AI472621 

AI066544 

AJ075803 

AI083982 



AI092235 
AI475949 
AI498991 
AI124514 
AI610791 
AI624497 



AI701559 
AI738720 
AI809301 
AI824829 
AI858667 

AF098363AF098365 
AF174008 AF174027 AF174106 

AA663131 
AA663591 
AA670480 
AA724659 
305849 


AA861571 


305854 


AA862733 


307113 


AI183686 


307130 


AI185234 


305937 


AA883238 


305977 


AA887293 


307451 


AI248615 


307513 


AI274307 


307846 


AI364186 


307871 


AI368665 


307831 


AI370434 


307932 


AJ230822 


307944 


AI418246 


307954 


AI419692 


307965 


AI421641 


309245 


AI972447 


309271 


AI986221 


309355 


AW072861 


309372 


AW074330 


309435 


AW090537 


309506 


AW137700 


309536 


AW151933 


309709 


AW242630 


325417 C12JW 




325450 c12_hs 




325452 c12_hs 




309815 


AW292760 


309839 


AW296076 


309849 


AW297444 


309906 


AW339340 


302705 31765J 


U09060U09061 


304037 


T26438 


304039 


T47349 


304236 


W93278 


304257 


AA053294 


304382 


AA232873 


304405 


AA282572 


304561 


AA489792 


304569 


AA490934 


304787 


AA582678 


304921 


AA603092 


327819 C.5J)S 




304968 


AA614308 


306382 


AA968967 


331263 47479J 


AW780192AA015718W02571 


332252 1663987J 


N63882 T91174 
TABLE 14B shows the genomic positioning for those primekeys lacking Unigene IDs and 
accession numbers in Table 14. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
listed. 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
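
The four columns defined above can be read programmatically. The following is a minimal sketch, not part of the original disclosure: the names `Exon` and `parse_row` are hypothetical, and it assumes that each position is written as two integers joined by a hyphen and that the coordinates are inclusive. Rows whose fields are not cleanly legible are returned as `None` rather than guessed.

```python
import re
from typing import NamedTuple, Optional


class Exon(NamedTuple):
    """One predicted exon row: Pkey, Ref, Strand and the two coordinates."""
    pkey: int
    ref: str
    strand: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumption: coordinates are inclusive. Minus-strand rows list the
        # higher coordinate first, so the absolute difference is used.
        return abs(self.end - self.start) + 1


# A legible position is exactly "<digits>-<digits>".
_POSITION = re.compile(r"^(\d+)-(\d+)$")


def parse_row(pkey: str, ref: str, strand: str, position: str) -> Optional[Exon]:
    """Return an Exon, or None when any field is not cleanly readable."""
    match = _POSITION.match(position.strip())
    if match is None or strand not in ("Plus", "Minus") or not pkey.isdigit():
        return None
    return Exon(int(pkey), ref, strand, int(match.group(1)), int(match.group(2)))


plus_exon = parse_row("332901", "Dunham, I. et al.", "Plus", "1841954-1842090")
minus_exon = parse_row("332864", "Dunham, I. et al.", "Minus", "1390386-1390296")
```

Returning `None` for an unreadable row keeps damaged entries visible to the caller instead of silently repairing them.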



Pkey    Ref                Strand  Nt_position 

332807  Dunham, I. et al.  Plus   297605*29/000 
332808  Dunham, I. et al.  Plus   298277-298360 
332812  Dunham, I. et al.  Plus   3Q9o88*3 10561 
332901  Dunham, I. et al.  Plus   1841954-1842090 
333149  Dunham, I. et al.  Plus   3574317-3574413 
333916  Dunham, I. et al.  Plus   8298994-8299169 
334026  Dunham, I. et al.  Plus   9196549-9196681 
334061  Dunham, I. et al.  Plus   9686941-9687077 
334073  Dunham, I. et al.  Plus   9792201-9792374 
334150  Dunham, I. et al.  Plus   10529221-10529854 
334379  Dunham, I. et al.  Plus   13908356-13908467 
334719  Dunham, I. et al.  Plus   15778859-15779026 
334773  Dunham, I. et al.  Plus   16235169-16235328 
334893  Dunham, I. et al.  Plus   19302753-19302881 
334935  Dunham, I. et al.  Plus   20108247-20108373 
335146  Dunham, I. et al.  Plus   21491292-21491457 
335320  Dunham, I. et al.  Plus   22542132-22542246 
335568  Dunham, I. et al.  Plus   24935021-24935655 
335586  Dunham, I. et al.  Plus   24990333-24990497 
335601  Dunham, I. et al.  Plus   25044923-25045157 
336036  Dunham, I. et al.  Plus   29019798-29019877 
336123  Dunham, I. et al.  Plus   30051089-30051186 
336268  Dunham, I. et al.  Plus   31897555-31998040 
337173  Dunham, I. et al.  Plus   23624127-23624224 
337460  Dunham, I. et al.  Plus   32536159-32536395 
337685  Dunham, I. et al.  Plus   3547161-3547245 
337736  Dunham, I. et al.  Plus   3850500-3850643 
337780  Dunham, I. et al.  Plus   4113793-4113990 
337965  Dunham, I. et al.  Plus   7034267-7034392 
337976  Dunham, I. et al.  Plus   7166011-7166119 
338030  Dunham, I. et al.  Plus   8072708-8072827 
338112  Dunham, I. et al.  Plus   10391398-10391600 
338165  Dunham, I. et al.  Plus   12205719-12205875 
338178  Dunham, I. et al.  Plus   12800037-12800181 
338427  Dunham, I. et al.  Plus   19685043-19685354 
338506  Dunham, I. et al.  Plus   21221871-21221953 
338794  Dunham, I. et al.  Plus   27114697-27114763 
338910  Dunham, I. et al.  Plus   26795375-28795551 
339047  Dunham, I. et al.  Plus   30760793-30760968 
332864  Dunham, I. et al.  Minus  1390386-1390296 
332933  Dunham, I. et al.  Minus  2035790-2035681 
333193  Dunham, I. et al.  Minus  3832993-3832494 
333712  Dunham, I. et al.  Minus  7286177-7286073 
333940  Dunham, I. et al.  Minus  8523830-8523671 
333942  Dunham, I. et al.  Minus  8552629-8552330 
334287  Dunham, I. et al.  Minus  13294116-13293871 
334387  Dunham, I. et al.  Minus  13946021-13945781 
334487  Dunham, I. et al.  Minus  14432191-14432132 
334913  Dunham, I. et al.  Minus  19463909-19463815 
335109  Dunham, I. et al.  Minus  21325792-21325667 
335250  Dunham, I. et al.  Minus  21952922-21952826 
335288  Dunham, I. et al.  Minus  22304275-22303770 
335290  Dunham, I. et al.  Minus  22309950-22309891 
335549  Dunham, I. et al.  Minus  24666203-24666128 
335662  Dunham, I. et al.  Minus  26690300-26690125 
335884  Dunham, I. et al.  Minus  26694537-26694382 
335905  Dunham, I. et al.  Minus  26988888-26988719 
336205  Dunham, I. et al.  Minus  30477456-30477311 
336276  Dunham, I. et al.  Minus  32093320-32093181 
336433  Dunham, I. et al.  Minus  34067540-34067425 
336605  Dunham, I. et al.  Minus  15616509-15616358 
336616  Dunham, I. et al.  Minus  26021027-26020848 
336679  Dunham, I. et al.  Minus  2035790-2035681 
337043  Dunham, I. et al.  Minus  17407330-17407251 
337272  Dunham, I. et al.  Minus  28241476-28241307 
337357  Dunham, I. et al.  Minus  30906179-30906109 
337393  Dunham, I. et al.  Minus  31471747-31471569 
337497  Dunham, I. et al.  Minus  33371317-33371258 
337646  Dunham, I. et al.  Minus  2648689-2648632 
337920  Dunham, I. et al.  Minus  6051648-6051510 
338083  Dunham, I. et al.  Minus  9318438-9316301 
338220  Dunham, I. et al.  Minus  14166440-14166104 
338752  Dunham, I. et al.  Minus  26421374-26421135 
338763  Dunham, I. et al.  Minus  26628148-26828009 
338933  Dunham, I. et al.  Minus  29908865-29908702 
339209  Dunham, I. et al.  Minus  32492953-32492593 
325240  5866848            Minus  32301-32650 
329532  3983505            Plus   42937-43014 
329522  3983507            Minus  35265-35458 
329519  3983510            Plus   18407-18597 
329511  3983514            Plus   20965-21325 
325326  5866875            Plus   47726-48024 
325303  5866908            Minus  73556-73630 
325389  5866921            Plus   239672-239759 
325417  5866925            Minus  110635-110745 
325450  5666941            Minus  435379-435552 
325452  5666941            Minus  704103-704202 
325498  5866967            Plus   173372-173930 
325587  6682462            Plus   126724-126967 
325602  5866994            Plus   79122-79251 
325701  5867028            Minus  72936-73046 
325780  6381953            Plus   63634-63873 
329722  6065785            Minus  112713-112992 
329728  6065785            Minus  207544-207741 
329666  6272129            Plus   98307-98446 
329815  6624888            Minus  68431-68720 
329841  6672062            Minus  40181-40331 
325824  5867048            Minus  42450-42833 
325866  5867076            Minus  94358-94628 
325902  5867101            Minus  127729-127842 
325958  5867142            Plus   53437-53550 
326014  5867160            Minus  10358-10447 
329941  6165199            Minus  34319-34411 
330002  6623963            Plus   46097-46158 
326154  5867170            Minus  7103-7179 
326023  5867245            Plus   171799-171896 
326278  5867269            Plus   75250-75903 
330036  6042048            Plus   117120-117216 
326547  5887307            Minus  623677-623870 
326495  5867423            Plus   11843-11930 
326507  5867435            Minus  13038-13111 
326505  5867435            Minus  8818-8949 
326506  5867435            Minus  9368-9509 
326530  5867441            Minus  303000-303122 
326508  6682496            Plus   78904-79112 
330120  6671864            Minus  127553-127656 
330123  6671869            Minus  35311-35406 
326858  6552462            Minus  69337-69670 
326983  5867657            Minus  16023-16581 
327014  5887664            Plus   1017630-1017788 
326930  6456782            Plus   [illegible] 
326920  6456782            Minus  41425-42519 
327058  6531965            Plus   2oo4^o-23o4o35 
327061  6531965            Minus  o4ooootf-o4oDo73 
327075  6531965            Plus   4041318-4041431 
327120  6531970            Minus  6-1088 
330126  6093735            Plus   824oo-B2623 
327157  5866841            Minus  4408-4746 
327183  5867442            Plus   84317-84531 
327192  5867445            Minus  184652-184764 
327288  5867481            Plus   4oo8o-4o773 
327469  5867772            Plus   145549-145708 
327489  6004459            Minus  577TO-58015 
327526  6381882            Minus  97010-97123 
327574  5867818            Plus   68767-69126 
327665  5867839            Plus   141736-141900 
327752  5867949            Plus   93721-94421 
327819  5867968            Minus  92202-92717 
327796  5867982            Plus   85267-85405 
330260  6671884            Plus   45203-45269 
330282  6671910            Plus   3982-4114 
328078  5868X8             Plus   72807-72865 
328121  5868031            Plus   153782-153850 
328190  5868077            Plus   21082-21165 
328227  5868105            Minus  21082-21242 
327871  5868131            Minus  88889-89221 
328018  5902482            Minus  542547-543133 
328624  5868246            Minus  120666-120836 
328744  5868290            Plus   138639-138722 
328799  5868316            Minus  80771-80923 
328291  5868363            Minus  144244-144434 
328329  5868375            Plus   191709-192239 
328369  5868386            Plus   7537V75583 
328385  5868395            Plus   369952-370155 
328397  5868397            Plus   344967-345063 
328412  5868405            Plus   86427-86519 
328538  5868485            Plus   3814-4243 
328656  6004473            Plus   792616-792729 
328638  6004473            Plus   294618-294903 
328903  5868514            Plus   23625-24468 
328960  6456775            Plus   38547-38837 
330320  5932415            Minus  54458-54697 
328993  5868536            Plus   49160-50084 
329081  5868602            Plus   93368-93510 
329089  5868614            Plus   25805-26923 
329109  5668626            Plus   102168-102273 
329192  5868716            Plus   166936-167020 
329216  5868726            Minus  71408-71707 
329224  5868728            Plus   27422-27664 
329246  5868732            Minus  250541-250792 
329415  5868874            Plus   1011438-1011818 
329454  5868887            Plus   51342-51593 
TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16 

Table 15 depicts UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and 
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is 
linked by EosCode to Table 16. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

UnigeneTitle: Unigene gene title 

EosCode: Internal Eos name 

Localization: Predicted cellular localization of gene product 
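
Because the text states that Table 15 is linked to Table 16 by EosCode, a lookup keyed on that column is one way to follow the link. The sketch below is illustrative only: `Table15Row` and `index_by_eos_code` are hypothetical names, the two sample rows are taken from Table 15, and the localization field is left empty because the column is not aligned with the rows in this copy. The index maps each code to a list because a code may appear on more than one row.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Table15Row(NamedTuple):
    """One Table 15 row, using the column names from the legend."""
    pkey: int
    ex_accn: str
    unigene_id: Optional[str]
    unigene_title: str
    eos_code: str
    localization: Optional[str]


def index_by_eos_code(rows: Iterable[Table15Row]) -> Dict[str, List[Table15Row]]:
    """Group rows by EosCode so a Table 16 entry can be traced back."""
    index: Dict[str, List[Table15Row]] = defaultdict(list)
    for row in rows:
        index[row.eos_code].append(row)
    return dict(index)


rows = [
    Table15Row(100394, "D84276", "Hs.66052", "CD38 antigen (p45)", "PBC1", None),
    Table15Row(100452, "D87742", "Hs.241552", "KIAA0268 protein", "PAB7", None),
]
index = index_by_eos_code(rows)
# A Table 16 sequence labelled with EosCode "PBC1" resolves to Pkey 100394.
```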



Pkey ExAccn UnigeneID UnigeneTitle EosCode Localization 



100394 D84276 Hs.66052 CD38 antigen (p45) PBC1 
100452 D87742 Hs.241552 KIAA0268 protein PAB7 
101249 L33681 Hs.1904 protein kinase C, iota OAA1 
101485 M24735 selectin E (endothelial adhesion molecul ACC5 
101514 M28214 Hs.123072 RAB3B, member RAS oncogene family PFJ2 
101851 M94250 Hs.82045 midkine (neurite growth-promoting factor LBH9 
102398 U42359 gb:Human N33 protein form 1 (N33) gene, PDG3 
102522 U53347 Hs.183556 solute carrier family 1 (neutral amino a PFJ4 
102669 U71207 Hs.29279 eyes absent (Drosophila) homolog 2 LEM9 
103119 X63629 Hs.2877 cadherin 3, type 1, P-cadherin (placenta LBG2 
103709 AA037316 Hs.13804 hypothetical protein <W4620235 PD06 
104080 AA402971 Hs£7771 kallikrein 11 PBA6 
104144 AA447439 Hs.183390 hypothetical protein FLJ13580 PDM3 
104691 AA011176 Hs.37744 Homo sapiens beta-1 adrenergic receptor PAV1 
105370 AA236476 Hs.22791 transmembrane protein with EGF-like and PDM9 
106149 AA424881 Hs.256301 hypothetical protein MGC13170 PD08 
106579 AA456135 Hs.23023 ESTs PAA4 
107102 AA609723 Hs.30652 KIAA1344 protein PAA3 
107217 D51095 DKFZP586E1621 protein PDG8 
108153 AA054237 Hs.40808 ESTs PBF1 
109014 AA156790 Hs.262036 ESTs, Weakly similar to Z223_HUMAN ZINC 
109112 AA169379 Hs.257924 hypothetical protein FLJ13782 BCU4 
109890 H04649 Hs£0843 Homo sapiens cDNA FLJ11245 fis, clone PL 
110151 H18836 Hs.31608 hypothetical protein FLJ20041 PAV9 
112971 T17185 Hs.83883 transmembrane, prostate androgen induced 
113021 T23855 Hs.129836 KIAA1028 protein PD03 
114908 AA236545 Hs.54973 cadherin-like protein VR20 PFJ6 
114965 AA250737 Hs.72472 ESTs BCY2 
116393 AA599463 hypothetical protein MGC2648 PDV3 
116416 AA609219 Hs.39932 ESTs OAB6 
117698 N41002 Hs.45107 ESTs PDT9 
117984 N51919 Hs.106778 ATPase, Ca++ transporting, type 2C, memb 
118985 N94303 Hs.55028 ESTs, Weakly similar to I54374 gene NF2 PDM8 
119018 N85796 Hs.278685 Homo sapiens prostein mRNA, complete cds 
119126 R45175 Hs.117183 ESTs PBF8 
120992 AA398246 Hs.97594 KIAA1210 protein PDG5 
121710 AA419011 prostate androgen-regulated transcript 1 POV5 
121913 AA428062 ESTs; protease inhibitor 15 (PI15) BCU7 
122041 AA431407 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 
122593 AA453310 Hs.128749 alpha-methylacyl-CoA racemase PD01 
123209 AA489711 Hs^03270 ESTs, Weakly similar to ALU1_HUMAN ALU S 
124526 N62096 Hs.293185 ESTs, Weakly similar to JC7328 amino aci PAV4 
126399 AA128075 transmembrane, prostate androgen induced 
126645 AI167942 Hs.61635 six transmembrane epithelial antigen of PAA5 
126966 R38438 Hs.182575 solute carrier family 15 (H+/peptide tra PD05 
127537 AA569531 Hs.162859 ESTs PAA6 
128790 AA291725 Hs.105700 secreted frizzled-related protein 4 BCX2 
129109 AA491295 Hs.108708 calcium/calmodulin-dependent protein kin PFJ7 
129184 W26769 Hs.109201 CGI-86 protein PAV6 
129389 AA621604 spondin 2, extracellular matrix protein CJA5 
plasma membrane 
not determined 
cytoplasmic 
plasma membrane 
cytoplasmic 



plasma membrane 
cytoplasmic 
plasma membrane 



plasma membrane 
plasma membrane 

plasma membrane 
not determined 

plasma membrane 
PDG7 

not determined 
P0G4 

plasma membrane 
CHA1 not determined 



mitochondrial 
secreted 

ER 

PAJ5 not determined 
PAB2 plasma membrane 



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 

plasma membrane 
plasma membrane 
not determined 
secreted 

vesicular 
not determined 



129404 AA172056 
129534 R73640 
130760 AA128997 Hs.18953 
131425 AA219134 Hs£6691 
132964 AA031360 
132987 AA032221 
133179 U81699 
133330 U42360 
133520 X74331 
133724 U07919 
133724 U07919 
133944 AA045870 Hs.7780 
134110 U41060 Hs.79136 



Hs.11260 



Hs.61635 
Hs.66731 
Hs.71119 
Hs.74519 
Hs.75746 
Hs.75746 



ESTs 

hypothetical protein FLJ11264 

phosphodiesterase 9A 

ESTs 

ESTs 



PAB4 
PAJ3 
PEE8 
PBA7 
PAA7 
lOf PM17 

homeo box B13 PFJ5 
Putative prostate cancer tumor suppresso PDM1 
primase, polypeptide 2A (58kD) PDM2 
aldehyde dehydrogenase 1 family, member 
aldehyde dehydrogenase 1 family, member 
Homo sapiens mRNA; cDNA DKFZp564A072 (fr 
LIV-1 protein, estrogen regulated BCR4 
301805 AI800004 Hs.142846 hypothetical protein PEU4 
302005 AJ869666 Hs.123119 MAD (mothers against decapentaplegic, Dr PBJ6 
302881 AA503353 Hs.105314 relaxin 1 (H1) PBH3 
303506 AA340605 Hs.105887 ESTs, Weakly similar to Homolog of rat 2 PEG4 
303599 D30891 Hs.19525 hypothetical protein FLJ22794 PBM4 
303753 AW503733 Hs.9414 KIAA1488 protein PBY3 
308050 AI460004 Hs.31608 hypothetical protein FLJ20041 PEU5 
310382 AI734009 Hs.127699 KIAA1603 protein PCQ8 
310431 AI420227 Hs.149358 ESTs, Weakly similar to A46010 X-linked PBH1 
310573 AW292180 Hs.156142 ESTs PEN3 
310598 AJ338013 Hs.140546 ESTs PCW3 
310816 AI973051 Hs.224965 ESTs PET5 
311598 AI682088 Hs.79375 holocarboxylase synthetase (biotin-[prop PBH8 
313676 AA861697 Hs.120591 ESTs PBY2 
314121 AI732100 Hs.187619 ESTs PBY1 
314691 AW207206 Hs.136319 ESTs BFF8 
314785 AI538226 Hs.32976 guanine nucleotide binding protein 4 CB07 
314907 AJ672225 Hs£22886 ESTs, Weakly similar to TRHY_HUMAN TRICH 
315051 AW292425 ESTs PBM9 
315052 AA876910 Hs.134427 ESTs PBJ7 
316442 AA760894 Hs.153023 ESTs PBJ9 
317548 AI654187 Hs.195704 ESTs PBQ6 
317869 AW295184 Hs.129142 deoxyribonuclease II beta PBQ7 
318524 AW291511 Hs.159066 hypothetical protein FLJ10188 PBJ1 
319191 AF071538 prostate epithelium-specific Ets transcr PEN1 
319763 AA460775 Hs.6295 ESTs, Weakly similar to T17248 hypotheti PE07 
320324 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR PBH5 
320561 NM_006953 Hs.159330 uroplakin 3 PEL9 
320796 AF038966 Hs.31218 secretory carrier membrane protein 1 PBY4 
321441 AW297633 Hs.118498 Homo sapiens LUCA-15 protein mRNA, splic 
322303 W07459 Hs.157601 ESTs CBF9 
322782 AA056060 Hs.202577 Homo sapiens cDNA FLJ12166 fis, clone MA 
322818 AW043782 Hs.293616 ESTs PCQ7 
323228 AF055019 Hs.21906 Homo sapiens clone 24670 mRNA sequence 
323287 AA639902 Hs.104215 ESTs, Moderately similar to SPCN_HUMAN S 
324295 AI146688 Hs.143691 ESTs PBQ9 
324430 AA464018 Hs.184598 Homo sapiens cDNA: FLJ23241 fis, clone C 
324603 AW016378 Hs.292934 ESTs PBM3 
324617 AA508552 Hs.195839 ESTs, Weakly similar to I38022 hypotheti PBH4 
324626 AI685464 gb:tt88f04.x1 NCI_CGAP_Pr28 Homo sapiens 
324658 AI694767 Hs.129179 Homo sapiens cDNA FLJ13581 fis, clone PL 
324718 AI557019 Hs.116467 small nuclear protein PRAC CBK1 
330211 PBJ2 
330546 U31382 Hs.299867 guanine nucleotide binding protein 4 PEW1 
330762 AA449677 Hs.15251 hypothetical protein PBM1 
330790 T48536 Hs.122764 TMPRSS2, transmembrane protease, serine 
330892 AA149579 Hs.91202 ESTs PBQ4 
331099 R36671 Hs.14848 Homo sapiens mRNA; cDNA DKFZp564D018 (fr 
331490 N32912 Hs.291039 ESTs PCI4 
331689 AA431407 Hs.98802 ESTs, Moderately similar to T14342 NSD1 PBH7 
332247 N58172 gb:za21f09.s1 Soares fetal liver spleen PBQ5 
332395 AA340504 gb*w31a09jc1 NCLCGAPJ<idt1 Homosaplen 
332697 T94885 transgelin 2 PBQ8 
332793 PBH2 
334447 PBY9 
338255 PBY7 



secreted 
nuclear 

plasma membrane 
plasma membrane 
nuclear 

plasma membrane 

PDT1 mitochondrial 
PDT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic 



not determined 
not determined 
plasma membrane 

plasma membrane 
plasma membrane 



not determined 
cytoplasmic 
PBM2 not determined 

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 
secreted 

PBQ1 not determined 
plasma membrane 
PCI2 not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
PCW6 

PBJ4 plasma membrane 

nuclear 

not determined 

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic 

nuclear 

not determined 
nuclear 

PBJ8 not determined 

secreted 

nuclear 

not determined 
not determined 
401424 PFG2 
407122 H20276 Hs.31742 ESTs PEW7 
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3 
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence 
409262 AK000631 Hs.52258 hypothetical protein FLJ20624 PFG1 
409861 NM_005982 Hs5441Q sine oculis homeobox (Drosophila) homolo PEW3 
411096 U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9 
413125 BE244589 Hs.75207 glyoxalase I PFJ3 
413623 AA825721 Hs.246973 ESTs OBH6 
414422 AA147224 Hs.337232 Homeobox A13 PFC6 
415263 AA948033 Hs.130853 ESTs PEZ5 
417153 X57010 Hs.81343 "collagen, type II, alpha 1 (primary ost PFJ1 
418601 AA279490 Hs.86368 calmegin PFA1 
418848 AI820961 Hs.193465 ESTs PEY4 
418882 NM_004996 Hs.89433 ATP-binding cassette, sub-family C (CFTR OBH2 
419839 U24577 Hs.93304 "phospholipase A2, group VII (platelet-a PFH9 
421887 AW161450 Hs.109201 CGI-86 protein PFH2 
422083 NM_001141 Hs.111256 "arachidonate 15-lipoxygenase, second ty PFH5 
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3 
425071 NM_013989 Hs.154424 "deiodinase, iodothyronine, type II" PFH6 
425710 AF030880 solute carrier family, member 4 PFD4 
427958 AA418000 Hs.98280 potassium intermediate/small conductance PFH1 
428819 AL135623 Hs.193914 KIAA0575 gene product PFD6 
429900 AA460421 Hs.30875 ESTs PEZ7 
429918 AW873986 Hs.119383 ESTs PEY5 
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface PEZ4 
431217 NM_013427 Hs.25G830 Rho GTPase activating protein 6 PFG6 
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1 
431992 NM_002742 Hs.2891 protein kinase C, mu PFH4 
432189 AA527941 gb:nh30o04.s1 NCI_CGAP_Pr3 Homo sapiens 
432244 AI669973 Hs£00574 ESTs PEW8 
432437 W07088 Hs.293685 ESTs PFG3 
432966 AA650114 Hs.325198 ESTs PEY3 
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEWS 
440260 AI972867 Hs713Q copine IV PEW6 
440901 AA909358 Hs.128612 ESTs PFC8 
445424 AB028945 cortactin SH3 domain-binding protein PEZ6 
448320 AF126245 Hs.14791 "acyl-Coenzyme A dehydrogenase family, m 
447210 AF035269 phosphatidylserine-specific phospholipas PFH8 
449156 AF103907 Hs.171353 prostate cancer antigen 3, non-coding DO PEZ8 
449625 NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2 
449650 AF055575 Hs£3838 calcium channel, voltage-dependent, L ty PFD2 
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJ8 
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp564O1763 (f 
452039 AI922988 ESTs PFD8 
452340 NM_002202 Hs^05 ISL1 transcription factor, LIM/homeodoma PFG4 
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5 
452946 X95425 Hs.31092 EphA5 PFH3 
TABLE 15A shows the accession numbers for those primekeys lacking a UnigeneID in Table 
15. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank 
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 

Accession: Genbank accession numbers 
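
When reading the "Accession" column from a scanned copy, it can help to separate well-formed accession tokens from damaged ones. The sketch below is an assumption-laden aid, not part of the original disclosure: `split_accessions` is a hypothetical helper, and the pattern covers only the three accession shapes seen in these tables (one letter plus five digits, two letters plus six digits, and RefSeq-style two letters, an underscore, and digits).

```python
import re
from typing import List, Tuple

# Accession shapes seen in these tables, e.g. "M30640", "AW292425", "NM_000450".
ACCESSION = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}_\d{6,9})$")


def split_accessions(field: str) -> Tuple[List[str], List[str]]:
    """Split a whitespace-separated Accession field into (valid, suspect)."""
    valid: List[str] = []
    suspect: List[str] = []
    for token in field.split():
        (valid if ACCESSION.match(token) else suspect).append(token)
    return valid, suspect


valid, suspect = split_accessions("AW292425 BE467167 AI702953 Al 693336 NM_000450")
```

Tokens that land in `suspect` are candidates for manual checking against the original page image rather than automatic correction.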



Pkey CAT number Accession 



116393 131543.1 



101485 18113.1 
126399 17331J 



132964 



94346J 
21074J 



129404 
107217 



AI9724Q2 A1634409 AI523716 AI799749 W44518 AI424438 AI688513 AI971048 AI686324 AW013854 AA588483 AA5281 1 1 AI627428 
AI582200 AI669298 A1826926 AI620526 A1669958 AI972458 AI924500 AA512903 W44517 AA335363 AW238997 BE300165 
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003820 AW009463 AA66979S AA1 14966 AI653342 AA1 15038 
AI342150 AI092100AI968211 W51994 A1804005 AI201420 AI123210 AI738405 A1874964 AI970341 AW027500 AI493316 A1333193 
AI139353 AA599463AI656163AI804200AI365321 AI990213 AI657011 AA650Q25 A1968810 AI341978 AA599839 AW592602 
AA644289 AI468578 A1565265 AI565226 BE221535 AW973052 

AA296520 AL021940 M30640 NM.000450 M24736 M61894 AL047443 H39560 AI694691 AA916787 AI214796 AA939085 AI150616 
AA412553 AA412545 AI051015 T?7654 AA694430 

AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 AI972096 AW071693 AI742327 AI377498 AIB04815 AJ 640802 
AI885001 AI921394 AA5951 15 N71820 AI921217 AW007283 AI467828 AI369306 AA917446 AI493698 AA088701 AA126899 AI93622B 
AW204238 AI039567 AI925027 BE138909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 AI521 341 
AW273569 AW050740 AA536113 AA559064 AI474392 AW135709 AA535181 AW572959 AA570597 AI905464 AI677810 AI587642 
AW975102 AA424310AA482527 N64192 AA658276 AW889117AA486591 AW889172 A1381990 AI381991 A1673419 AI990950 
AA487031 AI272934 AI150565 AA229168 AW316722 A1142707 BE222396 AA8141 68 AA1 22028 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 AI250993 BE146418 AA122025 
AI362575 AI805082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464 

NM.012445 AB027466 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250 
AW007762 AI341557 AI799666 AI97271 0 AI377966 AI962810 AI084783 A1458032 AI190971 AW148913 AA372354 AW970032 
AW007426 AA6501 88 At1 23203 AI122890 AI280975 W73595 W73495 AI863238 AA374109 AA803986 AW149089 AW957523 
AI307748 AI921067 AI336463F24537Ae80460AI367500 AI189309 AI814701 AI766921 AW572106 AA037024 AW072576 AA578293 
A1288103 AA235464 AW450642 AA574230 AW294024 A1589229 A1580733 AW512227 AA877009 AI660255 AW1 88597 AA558228 
AI572782 AA658397 AI274828 AI868359 AA864573 AI264439 AA621604 AW515493 AW243333 Z39737 AI567038 AA573997 
AA573559 AW236431 AI652870 AI684973 AA034505 AA047126 

AI267700 AI720344 AA191424 AI023543 AI469633 AA172056 AW958465 AA172236 AW953397 AA355086 
AL080235 AA031750 D81382 AW80231 AI095947 A1560953 BE010721 AI870290 AA374945 AA125792 D51527 D51558 A1685541 
051559 AW1 17288 AA195741 AIB75138 AW593439 A1201885 T30590 AW952100 D51095 AA523864 W70043 AA987586 AI421515 
A1205532 AA127069 A1337367 D51595 AI453785 AW075677 AW088359 C14287 C14284 

AF163474 NM.016590 AF163475 AI761105 AI770098 AA410580 AA411616 AI590343 Af739050 AL050198 AI862645 AA419104 
AA513809 AA333032 AI816915AW139625 AA640889 A1311391 AI627693 AW1 35514 AA419011 AI269149 AI245259 AI970008 
AI970017 AW139445 AA569503 A1761072 Al 768 179 AI759995 AI300776 AI870129 AW150770 AA226S01 AA226220 . 
AI249368 AI742316 AA428062 AA442089 AI864189 BE349478 AI803475 A1584049 BE552085 AI088609 AI264197 AI886144 AI129474 
AI307145 BE181300 AW0584O3 AI696838 AW748598 AA442196 A1216428 
entre*_U42359U42359 

347217.1 AW292425 BE467167 AI702953 BE550961 BE222309 AI299348 Al 693336 AA541708 
AI685464 AW971336 AA513587 AA525142 

NM 012391 AF071538AB031549AI685592AI745526AA662204 AW130657AA662164AW971121 A1668916 AA513274 AI991223 
AI979170 AW298436 AA639821 AI859010 AW513942 AI687669 AA662521 AA548598 AI345056 AI305374 BE043418 AI432856 
AI334840 AI379796 AI492693 AI307915 BE042082 A1307834 AI307858 AI309488 BE042210 AI435670 AI371605 A1862491 A1284563 
AI306872 A1255044 AI254601 AI251236 AI473073 AJ473042 AI432760 AI435664 A1336826 AI289365 AI369096 AI862274 A1334871 
AI349863 AI250405 AI377617 AI309895 AI313017 A1862291 A131 1936 AI378718 AI305722 AI306769 AI308888 AI334565 AI862296 
AI344230 AI435685 AI344C87 AI378S96 AI31 1209 AI435775 AI31061 1 AI311154AI432289 A1431561 AI492681 A1432667 AI335288 
AI492796 AI432769 AI310299 AI432273 AI379820 AI275319 AI435753 AI609441 AI432767 AI369100 AI311420 AI349974 AI247157 
AI334877 AI270910 AI224320 AI30S608 AI334489 AI377152 AI350012 AO70086 AI335053 AI306781 AI306750 A1334849 A1334874 
A1340380AI307876 AI3G5974 A13Q5972 AI311521 AI334872 AI862509 AI311498 A1335051 AI289684 AI310859 AI311862 AI862483 
AI492775 A1307906 AI492708 A1289693 AI340373 A1307910 AI31 1359 AI435653 AI334865 AI31 1492 AI492809 A1492690 AI431576 
A1862268 AJ31 1879 AJ308435 AI482792 AI862512 A1275321 AI431568 AI431564 AI307885 AI307926 AI435692 AI435778 AI310182 
AI308894 AI492707 AI49271 3 AI30856O AI307829 A1343234 AI580598 AW472786 A1340918 AJ310243 AI309368 AI307920 AI289665 
156454J 
8836.1 



121710 19268.1 



121913 291015J 

102398 
315051 

324626 33641 1_1 

319191 16065.1 
338255 
330211 
332798 
334447 
332247 

332396 20265.1 



332697 13699J 



425710 
432189 
445424 



447210 
449825 
452039 



AI306777 AW08631 8 AW086292 AW086378 AK310027 AI275293 A1369082 AI340900 A1306749 A1371558 AW086287 BE043803 
AI306793 AI306272 AI287B48 AI270917 AI284816 AI336813 AI284546 AI308044 AI275290 AI270372 AI306795 AI289687 A1223570 
AI305303 AI289677 AI287742 AI275284 AI306812 AI336701 AI371554 AI378719 AI344988 AI223631 AI335141 AI343222 AI284568 
A1305357 AI275270 AI345932 AI436549 AB07925 AI31 1502 AI344238 AI343182 AI308508 AI305988 A1270790 AI379792 A1305647 
A1305410 AI432251 AI436517 A1343227 AI305534 A1340387 AI271043 A1305499 AI271046 AI305962 AI289465 AI305378 AI289725 
AI310848 AI305848 AI289362 AI252964 AI307049 AI310831 AI306993 AI306796 AI224659 AJ305969 AI349855 AI306164 AI306948 
AI284676 AI309155 AI343202 AI432785 AI306815 AI369081 AI270885 AI288699 AI435704 A1309647 AI305716 AI31 1281 AI287927 
AJ472995 AI340423 AI270958 AI307069 A1305364 A1270807 AJ275306 AI311 890 AI275263 AJ432750 A1289371 AI432861 AI255113 
AI305709 AI473008 AI311168 AI309711 AI377164 AI271201 AI289560 AI309710 AI306 195 AJ3 11201 AI287741 A1271066 A1432676 
AI275281 AI379795 AI472972 A131 1967 AI306826 AI305465 A1270792 A1473019 AI305340 AI270922 A1305995 AI305462 AI254144 
AI270969 AI473012 AI305390 AI275278 AI223644 AI289692 AI250318 A1305372 A1289691 A1250521 A1306283 AI306814 AI307933 
AI473160 AI432803 AI223720 A1254979 AI334862 A1306926 AI209541 AI432248 AI435722 AI435698 AI432859 AI310683 A1473175 
AI335144 AI289467 A1436489 A1306928 AI473033 AI305763 AI307868 AI307882 AI348959 A1435736 AI432857 AW32896 AI435735 
A1432283 AI473086 AI432883 AW73081 AI432825 AI307840 AW73164 A1432885 AI473166 AI472982 AI435734 A1473060 AI473171 
AI432279 AI432882 AI334670 AI436512 AI432827 AI432652 AW73051 AW73077 A1435697 AI271509 AI492781 AI472983 AI473018 
AI432897 AI473043 A1432871 AI436536 A1473157 A1349715 AI432777 AI473016 AI473158 AI340369 A!307941 AI432773 A1377146 
AI492791 A1270950 AI3Q5342 A!284604 A1306269 AI284811 AI270811 AI289347 AI334869 AI334852 AI311759 A1250382 AI309520 
AI289550 AI305721 AI340870 AI270901 AI308578 AI307904AI340715 AI270941 AI309B08 AI246867 AI473014 AI307039 AI289360 
AI473069 AI492786 AI344013 AI305876 AI436510 AI340742 A1473Q28 AI307891 BE041871 BE041268 BE042340 BE041946 
BE041783 AI306173 AI201948 AI926972 AI275769 
CH2a_6856FG_UNK_afl:AC00 
C_5_p2 

CH22J4FG.6J5JJNK.C4G1 .Q 
CH22J746FG.387_7JJNK.EM 

372969.1 AA669097 AA5 13815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172 

AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW36781 1 AW367798 R17370 A1908947 
AA382932 R58449 H16732 AA371231 AW952899 AA713530 AWB92946 R53463 H11063 AW068542 Z40761 BE176212 8E176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 
BE463983 AJ805213 AI761264 W94885 N945G2 AI623772 AJ 4 19532 AI810302 AJ 634190 AW002516 AW150777 AI352312 AJ367474 
AW204807 AI675502 A1337026 AW134715 BE328451 AI123157 AI560020 AI300745 AI608631 A1248873 AA742484 AW051635 
H18646 AI245045 AA5071 1 1 AI640510 AI925594 AA1 15747 AA143035 AA151 106 

X51405 NM.001873 T1 1322 AL118886 BE328175 AW1 36009 BE467445 AW470313 AA774852 BE504139 AW501046 AA082792 
AW389231 AA370044 R38841 AA371457 C04813 R25791 R25558 AW895854 AW903819 AW895671 AW895677 BE159723 
AW895664 AW895597 AW895595 AW695665 AW888518 AI903724 F06091 F08503 AL1 19462 AW895730 AW88851 6 R2651 1 
R26489 AA334126 AA327626 N85713 AW895998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566 AA330159 AI922855 AA383512 AA029603 D82246 082171 T94933H56545AA348060 
AA176888 R96764 AW451817 AA385766 AA452618 AI690057 AA988822 BE549928 AA150901 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA3699B2 AW385948 AA922466 N75882 AI422070 AI361256 AI680224 D57122 T94885 
R53266 R46713 T19071 AW796277 AA325333 F0471G F02334 AA358146 AA626597 AA358304 AWQ28099 AL1 19570 D57290 
D58273 D57796 N46555 AI361S69 AA329457 D57225 AW024046 AA992606 AWQ221 18 AW021538 AA935645 H69870 K56546 
AW961219 AA453239 AW837541 N45521 BE2 18029 AA318877 AA327740 AW961809 T92139 D53216 D52365 D53363 D53312 
D531 16 A1547267 AA679935 AW026552 AWQ26418 AW1 90507 AI927710 AW244103 050948 AW054991 AW021063 AW02251 1 
AA493436 AI365636 BE464751 AW149384 AA102442 AW771388 A1818251 AI126368 D51049 AI421542 A1559487 AW079779 
AW021048 AW023969 AW044214 A1458264 AA027274 AI620254 AW028817 BE219511 AA326242 N67561 AI971273 AA878328 
D57131 AA770662 AJ309299 A1796767 AA613338 W58076 A1566287 A1445573 AI880260 AA001 919 AW339259 A1492610 A149261 1 
R97692 A1301425 AA722603 D58361 AI350323 AA973926 AI431263 AA516126 AA865467 A1925177 N39443 AA001943 A1299371 
AI082412 AA665090 AA583433 H89871 AA977231 AI362219 AI056096 A1270446 N67524 N22103 AW614224 AA744054 AW243622 
AI613188 AI929173 AI35G243 AI362138 AA744004 AA176661 D56767 AI955625 A1393109 AI094769A1479726 AI423107 AI955617 
A1034036 A1582196 AW264534 AW18961 AA570761 AJ343538 AA650341 AA992503 AA770004 AL039666 AI862675 AW1 90335 
AA510274 AW41 8627 BE467472 D56786 128749 AI217610 AI359556 T23523 AL040189 AA846222 AA651636 D51280 AI888986 
AI521167 AI340177 AW612815 AI625285 AA621607 AA177059 AA229768 AA829788 AJ749682 AW190631 N75299 AA230089 
AI91S632 BE069542 AA890020 AA528397 AA995390 BE503860 AA570812 AW339396 AI197986 AI203725 AI282379 AA670375 
AA461513 F01728 AW243599 C00856 N75587 R95995 AA150932-R95961 AA648060 AA933800 AA927073 AA101126 AA864190 
T93566BE167472 

AF030880 NM.000441 AC002467 AA385554 H23053 AW891838 AJ 139 9 68 AA653057 A1695233 
AA527941 AJ810608 AJ620190 AA535266 

AB028945 T77648 F13328 AL157605 Z46212 AA304736 F1 1855 T66098 T30174 AW954164 AW176301 AW748243 AA456428 
AI369958 AA938565 AW959613 Z42008 AA994779 AI683909 F1 1019 F10926 AJ769597 AI752550 T65015 AI884314 AA643854 
Z41838 AW020147AI038822 AW571822 AA299781 AA894928 AF131790 BE00541 1 AI902476 AW082695 AA464384 R42750 
AW9Q2301 AA464273 R05837 Z38294 H41Q98 AL134507 M86079 

AF035269 AF035268 NM.015900 T96213 U37591 AA156832 AA299371 AI084325 H9S977 AI765967 BE221465 AA156726 AI969563 
AW024539 AI436791 AI949451 AA843093 AI452756 AA824232 A1306667 T96131 AW207447 AW243558 AW957032 AI084332 
H95978 U30998 

NM.014253 AF100772 BE088769 AL022718 BE 161 779 AW863569 BE161640 AL039060 BE168542 AW296554 AA323193 AA235370 
AW779760 N48874 AI375997 R45432 D59344 AI203107 F07491 R35360 R25094 AI913631 A1498402 T61382 A1016320 N45526 
T61415AA331486 

89513.1 A1922988 H05475 AA021608 AW169947 AA913750Z41614 AW800012 



25529.1 

342819.1 

6391J 



7119J 



8113J 
TABLE 15B shows the genomic positioning for those primekeys lacking unigene IDs and 
accession numbers in Table 15. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
listed. 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the 
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 



Pkey      Ref                  Strand   Nt_position 

334447    Dunham, I. et al.    Plus     14308764-14308824 
332798    Dunham, I. et al.    Minus    232147-231974 
338255    Dunham, I. et al.    Minus    15242294-15242231 
330211    6013592              Plus     59158-59215 
401424    8176894              Plus     24223-24428 
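
The nucleotide positions listed in Table 15B can be turned into exon lengths with a few lines of Python. This is only a minimal sketch: it assumes (the source does not say so) that positions are 1-based and inclusive, and that minus-strand exons are written with the larger coordinate first. The function name `exon_span` and the tuple layout are hypothetical.

```python
def exon_span(nt_position):
    """Return (start, end, length) with start <= end for an 'a-b' range string.

    Assumption: coordinates are 1-based and inclusive, so the length is
    end - start + 1 regardless of the order in which they are written.
    """
    first, second = (int(part) for part in nt_position.split("-"))
    start, end = min(first, second), max(first, second)
    return start, end, end - start + 1


# Rows copied from Table 15B: (Pkey, Ref, Strand, Nt_position).
rows = [
    ("334447", "Dunham, I. et al.", "Plus", "14308764-14308824"),
    ("332798", "Dunham, I. et al.", "Minus", "232147-231974"),
    ("338255", "Dunham, I. et al.", "Minus", "15242294-15242231"),
    ("330211", "6013592", "Plus", "59158-59215"),
    ("401424", "8176894", "Plus", "24223-24428"),
]

# Map each probeset key to the length of its predicted exon.
exon_lengths = {pkey: exon_span(pos)[2] for pkey, _ref, _strand, pos in rows}
```

Normalising to start <= end keeps the strand as separate information, so the same length calculation applies to plus- and minus-strand entries.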
TABLE 11 AND SEQUENCE LISTING 

SEQ ID NO:1 BCU4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024915 
Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 

10 ATGCCCAGTG ACCCTCCATT C A ATACCCGA AG AGCCTACA CCAGTG AGG A TGAAGCCTGG 120 
AAGTCATACT TGGAGAATCC CCTGACAGCA GCCACCAAGG CCATGATGAT CATTAATGQT 180 
GATGAGGACA GTGCTGCTGC CCTCGGCCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 
AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 
TGCCTTGGCA CCAGTGAAGC CCAGAGTAAT TTGAGTGGAG GAGAAAACCG AGTGCAAGTC 360 

15 CTAAAGACTG TTCCAGTGAA CCTTTCCCTA AATCAAGATC ACCTGG AGAA TTCCAAGCGG 420 
G AACAGTACA GCATCAGCTT CCCCG AG AGC TCTGCC ATCA TCCCGGTGTC GGGAATCACG 480 
GTGGTGAAAG CTGAAGATTT CACACCAGTT TTCATGGCCC CACCTGTGCA CTATCCCCGG 540 
GGAGATGGGG AAGAGCAACG AGTGGTTATC TTTGAACAGA CTCAGTATGA CGTGCCCTCG 600 
CTGGCCACOC ACAGOGCCTA TCTCAAAG AC G ACCAGCGCA GCACTCCGGA CAGCACATAC 660 

20 AGCGAGAGCT TCAAGG ACGC AGCCACAGAG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720 
GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 
TCTCTCCGTC AGAAGCAGGG GGAGGGCCCC ATGACCTACC TCAACAAAGG ACAGTTCTAT 840 
GCCATAACAC TCAGCGAOAC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 

„ AGGAGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA GAGATGAACA GCTCAAATAC 960 

25 TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 " 
TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTCC 1080 
TTTACCTGGG ACGTGAATGA AGAGGCGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1140 
OATTTCTCCT CCCAAAAAGG GGTGAAAGGA CTTCCTTTGA TGATTCAGAT TGACACATAC 1200 
AGTTATAACA ATCGTAGCAA TAAACCCATT CATAGAGCTT ATTGCCAGAT CAAGGTCTTC 1260 

30 TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 
GGG AAAGGCC AGGCCTCCCA AACTCAATGC AACAGCTCCT CTGATGGGAA GTTGGCTGCC 1380 
ATACCTTTAC AGAAGAAGAG TGACATCACC TACTTCAAAA CCATGCCTGA TCTCCACTCA 1440 
CAGCCAGTTC TCTTCATACC TG ATGTTCAC TTTGCAAACC TGCAGAGGAC CGG ACAGGTG 1500 
TATTACAACA CGG ATGATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 

35 CCCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620 
CGAGTGCTCT TGTACGTGAG G AAGGAGACT GACGATGTGT TCGATGCATT GATGTTGAAG 1680 
TCTCCCACAG TGATGGGCCT G ATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 
AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTGAACAT GGATGACAAC 1800 
ATCATCGAGC ACTACTCGAA CGAGGACACC TTCATCCTCA ACATGGAGAG CATGGTGGAG 1860 

40 GGCTTCAAGG TCACGCTCAT GGAAATCIAQ CCCTGGGTTT GGCATCCGCT TTGGCTGGAG 1920 
CTCTCAGTGC GTTCCTCCCT GAGAGAGAC A GAAGCCCCAG CCCCAGAACC TGGAGACCCA 1980 
TCTCCCCCAT CTCACAACTG CTGTTACAAG ACCGTGCTGG GGAGTGGGGC AAGGGACAGG 2040 
CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCACCTAC CACGGAGCCG AAGCCTGAGC 2100 
CCCTCAGGAA GGTGCCTTAG GCCTGTTGGA TTCCTATTTA TTGCCCACCT 1TTCCTGGAG 2160 

45 CCCAGGTCCA GGCCCGCCAG GACTCTGCAG GTCACTGCTA GCTCCAGATG AGACCGTCCA 2220 
GCGTTCCCCC TTCAAGAGAA ACACTCATCC CGAACAGCCT AAAAAATTCC CATCCCTTCT 2280 
TTCTCACCCC TCCATATCTA TATCTCCCGA GTCGCTGG AC AAAATO AGCT ACGTCTGGGT 2340 
GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TGCCCACTTT CTGGTCAGAC ACCTTTAGGT 2400 
TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT CCCAGCAAGT GGCCACCAGG 2460 

50 CCTTGTACAG GAAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGOCTGTCT 2520 
GCATTGTACA TAGTGTTTAT AATATTGTAA TAATATATTT TACCTGTGGT ATGTGGGCAT 2580 
GTTTACTGCC ACTGGCCTAG AGG AGACACA GACCTGGAGA CCGTTTTAAT GGGGGTTTTT 2640 
GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGCCTTTGG GATGTTAAGG 2700 

^_ TGACTGCAGC TG ATGCCAAG ATGGACTCTG CAATGGGCAT ACCTGGGGGC TCGTTCCCTG 2760 

55 TCCCCAGAGG AAGCCCCCTC TCCTTCTCCA TGGGCATGAC TCTCCTTCGA GGCCACCACG 2820 
TTTATCTCAC AATG ATGTGT TTTGCCTGAC TTTCCCTTTG CGCTGTCTCG TGGG AAAGGT 2880 
CATTCTGTCT GAGACCCCAG CTCCTTCTCC AGCTTTGGCT GCGGGCATGG CCTGAGCTTT 2940 
CTGGAGAGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 
TCCTTGGCTA TCAGGAGAAT CCTGGACACT GTACTGTGCC TCCCAGTTTA CAAAQACGCC 3060 

60 CTTCATCTCA AGTGGCCCTT TAA AAGGCCT GCTGCCATGT GAGAGCTOTG AACAGCTCAG 3120 
CTCTGAGTCG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 
GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAGATACC 3240 
TGGCTCCTGT GAAACCAGCC TCAGGAGGGA AACTGGGAGA GAGAAGCTGT GGTCTCCTGC 3300 
- TACATGCCCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATGATGAAC 3360 

65 CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT CGCCCTTGTG 3420 
GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480 
GGCCCAGATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 
TCCTGCGCTC CCACCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 
_ GCTCAGCTOT TTCTCCTTGA GGTTGCGG AG GAATTGAATT GAATGGGACA GAGGGCAGGT 3660 

70 GCTGTGGCCA AGAAGATCTC CG AGC AGC AG TGACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720 
GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCGTG GATCCAGCTG TGCTCCAGTC 3780 
TGTCCCCTCC TCCTCCACTC TG ACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 
GGGTACTAAT GGGGCTCTGT TCTG AGATGG ACAAATTC AG TOTTGG AAAT ACATGTTGTA 3900 
CTATGCACTT CCCATGCTCC TAGGGTTAGG AATAGTTTCA AACATG ATTG GCAG ACATAA 3960 

75 CAACOGCAAA TACTCGOACT GGGGCATAGG ACTCCAGAGT AGGAAAAAGA CAAAAOATTT 4020 
GGCAGCCTG A CACAGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATGAA ACTGTTTGTT 4080 
TGCCAGTCCT GCCCTAAGGC AGAAGATGAA TTGAAGATGC TGTGC ATGTT TCCTAAGTCC 4140 
TTGAGCAATC ATGGTGGTG A CAATTGCCAC AAGGGATATG AGGCCAGTGC CACCAGAGGG 4200 
TGGTGCCAAG TGCCACATCC CTTCCG ATCC ATTCCCCTCT GTATCCTCGO AGCACCCCAG 4260 
TTTGCCTTTG ATGTGTCCGC TGTGTATGTT AGCTGAACTT TGATGAGCAA AATTTCCTGA 4320 
GCGAAACACT CCAAAGAGAT AGGAAAACTT GCCGCCTCTT CTTTTTTGTC CCTTAATCAA 4380 
- ACTCAAATAA GCTTAA AAAA AATCCATGGA AOATCATGOA CATGTGAAAT GAGCATTTTT 4440 
5 TTCTTTTCTT 1111111111 TTTTTTTAAC AAAGTCTGAA CTG AACAGAA CAAOACTTTT 4500 
TCCTCATACA TCTCCAAATT GTTTAAACTT ACTTTATGAG TGTTTGTTTA GAAGTTCGGA 4560 
CCAACAGAAA AATGCAGTCA GATGTCATCT TOG AATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCCCTGCCCA GAAACTTAGG AAGCATGAAA TAAATCAAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATOCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TTT 



SEQ ID NO:2 BCU4 Protein sequence: 
Protein Accession #: NP_079191.1 



1 11 21 31 41 51 
I I I I I I 

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMIINGDEDS 60 
AAALQLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120 
20 PVNLSLNQDH LENSKREQYS ISFPESS All PVSGITWKA EDFTPVFMAP PVHYPRGDGE 180 

EQRVVIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240 
DQTSSGTFQY TLEATKSLRQ KQGEGPMTYL NKGQFYAITL SETGDNKCFR HPISKVRS W 300 
MWFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNffiH AYNAVSFTWD 360 
_ VNEEAKIFIT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQDCVFCDKG 420 

25 AERKIRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480 
FIPDVHFANL QRTGQVYYNT DDEREGGS VL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540 
YVRKETDDVF D ALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKGIL VNMDDNIIEH 600 
YSNEDTFTLN MESMVEGFKV TLMEI 



SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1: 

Nucleic Acid Accession #: AA428062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2: 

Nucleic Acid Accession #: AA428062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:5 BCU7 Protein sequence Variant 1: 
Protein Accession #: none 

1 11 21 31 41 51 

I I I I I I 

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180 

SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 

SEQ ID NO:6 BCU7 Protein sequence Variant 2: 

Protein Accession #: none 



1 11 21 31 41 51 

I I I I I I 

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180 
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 
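
The two BCU7 variants appear to differ at a single nucleotide. The sketch below shows one way to locate such a difference and map it to a codon number. It assumes the two 10-base blocks are transcribed correctly from positions 561-570 of SEQ ID NO:3 and SEQ ID NO:4, and that the coding sequence starts at nucleotide 1, as stated for both variants. The helper names `diff_positions` and `codon_number` are hypothetical.

```python
def diff_positions(seq_a, seq_b, first_position=1):
    """List (position, base_a, base_b) for every mismatch between two
    equal-length sequences; positions are numbered from first_position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [
        (first_position + i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


def codon_number(nt_position, cds_start=1):
    """1-based codon index containing a 1-based nucleotide position."""
    return (nt_position - cds_start) // 3 + 1


# Blocks covering nucleotides 561-570 of each variant, as printed in the listing.
block_variant_1 = "AATTCATGCT"
block_variant_2 = "AATTCATACT"

differences = diff_positions(block_variant_1, block_variant_2, first_position=561)
affected_codons = [codon_number(pos) for pos, _a, _b in differences]
```

Under these assumptions the mismatch falls in codon 190 (GCT in variant 1, ACT in variant 2), which matches the residue at position 190 in the two protein listings (A in SEQ ID NO:5, T in SEQ ID NO:6).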

SEQ ID NO:7 BCX2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_003014 

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I 1 I 

GGCGGGTTCG CGCCCCG AAG GCTGAO AGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGG AGT TTGGGGGAAG 120 
AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGOGAGAGGG CAGTGCCATG 240 
TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 
ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TGGACGTGAA CTGCAGOGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480 
GCGOCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 
AGCCTGGCCT GCGACG AGCT GCCTGTCTAT OACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 
ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 
AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTCGT GGATGTAAAA 900 
GAGATCTTC A AGTCCTCATC ACCCATCCCT CG AACTC AAG TCCCGCTCAT TACAAATTCT 960 
TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 
GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320 
GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCOCCT TGCCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAG ACACATC TTGCAGCATT TTTCTTAAGG CTATGGTTCA 1440 
Gl UilCl 11 GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGOTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620 
AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTG ATGTTTT AAAATGTGAT G AAAATATAA TGTTTTTAAG 1740 
AAGG AAC AGT AGTGG AATG A ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1 800 
TITM GTG AT GAAAGGGG AT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TGTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAGCT AG AAACTTAA AAACAAAAAT 1920 
AATAATAAAG AAAAATAAAT AAAAAGGAGA GGCAGACAAT GTCTGGATTC CIGTTTTTTG 1980 
GTTACCTG AT TTCCATG ATC ATG ATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAG AA 2040 
ACAGTGAGTT TGTCTOTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220 
TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280 
AAAGTTG AGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340 
AAAAAG AACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 
AGGC ATTC AA TAAATGCACA ACGCCCAAAG G AAATAAAAT CCTATCTAAT CCTACTCTCC 2520 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 
GCACTTATAA AATGATTTG A ACAAATAAAA CTAGG AACCT GTATACATGT GTTTCATAAC 2640 
CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TA AACATCTC ACCGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence: 
Protein Accession #: NP_003005.1 

1 11 21 31 41 51 
I I I I I I 

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60 
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120 
ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180 
KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240 
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300 
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV 

SEQ ID NO:9 CBK1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032391 

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

t I I I I I 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTOGGAGGC TOAAACCTTT 60 

AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 120 

GAACAQC GAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA 1B0 

AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA 240 

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTOGGGAGG CCGAGGCAGG AAGATTCCTT 300 

QAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG G 



SEQ ID NO:10 CBK1 Protein sequence: 
Protein Accession #: NP_115767 



1 11 21 31 41 51 

I I I I I I 

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRKIP 
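
The coding-sequence coordinates given for each DNA entry can be cross-checked against the length of the corresponding protein entry. The following is a minimal sketch, assuming 1-based inclusive coordinates that include the stop codon (consistent with the note that the underlined sequences correspond to start and stop codons). The function name and the dictionary are hypothetical, and the protein lengths were counted by hand from the listing.

```python
def expected_protein_length(cds_start, cds_end):
    """Number of residues encoded by a coding sequence whose 1-based,
    inclusive coordinates include the terminal stop codon."""
    cds_length = cds_end - cds_start + 1
    if cds_length % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_length // 3 - 1  # subtract the stop codon


# (cds_start, cds_end, residues counted in the protein listing)
listed_entries = {
    "SEQ ID NO:3 / NO:5": (1, 777, 258),
    "SEQ ID NO:7 / NO:8": (238, 1278, 346),
    "SEQ ID NO:9 / NO:10": (129, 302, 57),
}

# True where the stated coordinates agree with the listed protein length.
consistent = {
    name: expected_protein_length(start, end) == residues
    for name, (start, end, residues) in listed_entries.items()
}
```

A mismatch from a check like this would usually point to a garbled coordinate or a dropped residue in the extracted text rather than to an error in the underlying sequence.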

SEQ ID NO:11 CKA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_020182 

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TCCTTGGGTT CGGGTGAAAG CGCCTGGGGG TTCGTGGCCA TGATCCCCGA GCTGCTGGAG 60 
AACTGAAGGC GGACAGTCTC CTGCGAAACC AGGCAATGGC GGAGCTGGAG TTTGTTCAGA 120 
TCATCATCAT CGTGGTGGTG ATGATGGTGA TGGTGGTGGT GATCACGTGC CTGCTGAGCC 180 
ACTACAAGCT GTCTGCACGG TCCTTCATCA GCCGGCACAG CCAGGGGCGG AGGAGAGAAG 240 
ATGCCCTGTC CTCAGAAGGA TGCCTGTGGC CCTCGGAGAG CACAGTGTCA GGCAACGGAA 300 
TCCCAGAGCC GCAGGTCTAC GCCCCGCCTC GGCCCACCGA CCGCCTGGCC GTGCCGCCCT 360 
TCGCCCAGCG GGAGCGCTTC CACCGCTTCC AGCCCACCTA TCCGTACCTG CAGCACGAGA 420 
TCGACCTGCC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TACCAGGGCC 480 
CCTGCACCCT CCAGCTTCGG GACCCCGAGC AGCAGCTGGA ACTGAACCGG GAGTCGGTGC 540 
GCGCACCCCC AAACAGAACC ATCTTCGACA GTGACCTGAT GGATAGTGCC AGGCTGGGCG 600 
GCCCCTGCCC CCCCAGCAGT AACTCGGGCA TCAGCGCCAC GTGCTACGGC AGCGGCGGGC 660 
GCATGGAGGG GCCGCCGCCC ACCTACAGCG AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 
TCCAGCACCA GCAGAGCAGT GGGCCGCCCT CCTTGCTGGA GGGGACCCGG CTCCACCACA 780 
CACACATCGC GCCCCTAGAG AGCGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG 840 
GACACCCTCT CTAGGGTCCC CAGGGGGGCC GGGCTGGGGC TGCGTAGGTG AAAAGGCAGA 900 
ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AACGCATCGT 960 
GTGGCCCTCC CCTCCCACCT CCCTGTGTAT AAATATTTAC ATGTGATGTC TGGTCTGAAT 1020 
GCACAAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAG AAAAAAAAAA ACCACGTTTC 1080 
TTTGTTGAGC TGTGTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA AAAAAAAAAA 1140 



SEQ ID NO:12 CHA1 Protein sequence: 
Protein Accession #: NP_064567 

1 11 21 31 41 51 

I I I I I I 

MAELEFVQII IIVVVMMVMV VVITCLLSHY KLSARSFISR HSQGRRREDA LSSEGCLWPS 60 
ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120 
EEPPPYQGPC TLQLRDPEQQ LELNRESVRA PPNRTIFDSD LMDSARLGGP CPPSSNSGIS 180 
ATCYGSGGRM EGPPPTYSEV IGHYPGSSFQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKEKDKQKGH PL 

SEQ ID NO:13 CJA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012445 

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

I I I I I I 

GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAAGGTCGCT GGGCAGGGCG AGTTGGGAAA 60 

GCGGCAGCCC CCGCCGCCCC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT CCTATCTGCC 120 

TCTCGCTGGA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 

GGCCCGGGGC GCOGGCCTCG GGCTTAAATA GGAGCTCCGG GCTCTGGCTG GGACCCGACC 240 

GCTGCCGGCC GCGCTCCCGC TGCTCCTGCC GGGTGATGGA AAACCCCAGC CCGGCCGCCG 300 

CCCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCOGCC GGOCAGCCTC 360 

TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420 

GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT GCTGGGGGCC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCG CTGATGAAGG 600 

AGATCGAGGC GGCGGGGGAG GCGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGOCCGCCG 660 

TCCCCAGCGG CACCGGGCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 

TCTCGTTTGT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTGGGCGTG GACAGCCTGG 780 

ACCTGTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 

CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCCCCAACTT CGCCACCATC CCGCAGGACA 900 

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 

GGCTGAAGGC CCTGCCTCCC ATCGCCAGOG TGACACTGGT GCGGCTGCGA CAGAGCCCCA 1020 

GGGCCTTCAT CCCTCCCGCC CCAGTCCTGC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGCCG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200 

CCGCCAACAA CGGGAGCCCC TGCCCCGAGC TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 

ACTGCGTC TA AGACCAGAGC CCCGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320 

GGCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCGTGTCCCG 1500 

TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCTGC TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTCCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800 
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence: 
Protein Accession #: NP_036577 

1 11 21 31 41 51 

I I I I I I 

MENPSPAAAL GKALCALLLA TLGAAGQPLG GSSICSARAP AKYSITFTGK WSQTAPPKQY 60 

PLPRPPAOWS SLLGAAHSSD YSMWRKKQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 

HAVFSAPAVP SGTGQTSAEL EVQRRHSLVS FWRZVFSPD WFVGVDSLDL CDGDRWREQA 180 

ALDLYPYDAG TDSGPTFSSP NFATIPQDTV TBITSSSPSH PANSFYYPRL KALPPIARVT 240 

LVRLRQSFRA FIPPAFVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300 
RTRYVRVQPA NNGSPCPELB EEAECVPDNC V 



SEQ ID NO:15 LBH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002391 

Coding sequence: 26-457 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTOCTCC TCACCCTCCT 60 

CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360 

CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACOCCCA AGACCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



SEQ ID NO:16 LBH9 Protein sequence: 
Protein Accession #: NP_002382 

1 11 21 31 41 51 

| | | | | | 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 

CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 



SEQ ID NO:17 LEM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005244 

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGTAGAAC TAGTGATCTC ACCCAOCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGC7GACGC TGCTGTGTGG ACTCTGAGTG ACAGACAAGG CATCACCAAA 120 

TCGGCCCCCC TGAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGGACAG AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACCC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT CCCGGCCAGC AGCATCTGCC CTTCGCCCCT CTCCACGTCC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATGA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGQAAGGA CACCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA AGTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGGAGTA TTTATAG 

SEQ ID NO:18 LEM9 Protein sequence: 
Protein Accession #: NP_005235 



1 11 21 31 41 51 

| | | | | | 

HVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLPR 60 

QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAVGIPSYS IKTEDSLNHS PGQSGFLSYG 120 

SSPSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGFGSVH QDYPSYPGFP QSQYPQYYGS 180 

SYNPPYVPAS SICPSPLSTS TYVLQEASHN VFNQSSESLA GEYNTHNGPS TPAKEGDTDR 240 

PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300 

SVRIGLHMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360 

GANLCLGSGV HGGVDWMRKL AFRYRRVKEM YNTYKNNVGG LIGTPKRETW LQLRAELEAL 420 

TDLWLTHSLK ALNLINSRPN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480 
ESCFERIKQR FGRKAVYWI GDGVBEEQGA KKHNMPFWRI SCHADLEALR HALBLEYL 



SEQ ID NO:19 OAA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002740 

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CCGCGGTTCC GGCTGCTCCG GCGAGGCGAC CCTTGGGTCG GCGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCCCCCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TGTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG GCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTGCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTGATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 

ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320 

GACAATGTAT TACTGGACTC TGAAGGCCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380 

GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440 

CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGTGGGC TCTTGGAGTG 1500 

CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560 

CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 

CCACGTTCTC TGTCTGTAAA AGCTGCAAGT GTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 

AAGGAACGAT TGGGTTGTCA TCCTCAAACA GGATTTGCTG ATATTCAGGG ACACCCGTTC 1740 

TTCCGAAATG TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 

AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCTGTC 1860 

CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920 

TTTGAGTATA TCAATCCTCT TTTGATGTCT GCAGAAGAAT GTGTCTGATC CTCATTTTTC 1980 

AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATGGAT AAACTTGCTG CAAGCCTGGA 2040 

TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100 

ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 

TCCAGACAAT CATGTCAAAA TTTAOTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA G 



SEQ ID NO:20 OAA1 Protein sequence: 
Protein Accession #: NP_002731 

1 11 21 31 41 51 

| | | | | | 

MSRTVAGGGS GDHSHQVRVK AYYRGDIMIT HPEPSISPEG LCNEVRDMCS FDNEQLFTMK 60 
WIDEEGDPCT VSSQLELEEA PRLYELNKDS ELLIHVFPCV PERPGMPCPG BDKSIYRRGA 120 
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180 
CGRHSLPQEP VMFMDQSSMH SDHAQTVIPY NPSSHESLDQ VGEEKEAMNT RESGKASSSL 240 
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKWKKEL VNDDEDIDW QTEKHVFEQA 300 
SNHPFLVGLH SCFQTESRLF FVIEYVNGGD LHFHMQRQRK LPEEHARFYS AEISLALNYL 360 
HERGIIYRDL KLDNVLLDSE GHIKLTDYGM CKEGLRPGDT TSTFCGTPNY IAPEILRGED 420 
YGFSVDWWAL GVLMFEMMAG RSPFDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480 
ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRNVDWDM MEQKQWPPF KPNISGEFGL 540 
DNFDSQFTNE PVQLTPDDDD IVRKIDQSEP EGFEYINPLL MSAEECV 



SEQ ID NO:21 OBH2 DNA SEQUENCE 

Nucleic Acid Accession #: 145628 

Coding sequence: 197-4782 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CCAGGCGGCG TTGCGGCCCC GGCCCCGGCT CCCTGCGCCG CCGCCGCCGC CGCCGCCGCC 60 

GCCGCCGCCG CCGCCGOCAG CGCTAGCGCC AGCAGCCGGG CCCGATCACC OGCCGCCCGG 120 

TGCCCGCCGC CGCCCGCGCC AGCAACCGGG CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC ACCGGCATGG CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACGT GGAATACCAG CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTCTGGGCC TGTTTCCCCT TCTACTTCCT 360 

CTATCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACCACGCTG CTTGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTGT GCCCTAGCCA TCCTGAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC TTTTCCCTCT TACTCATTCA GCTCGTCTTG TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC CCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACCGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGTACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAGT TCCAAGGTGG ATGCGAATGA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTT CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG CTGCTGTTTG TCACTGCCTG 1320 

CCTGCAGACC CTCGTGCTGC ACCAGTACTT CCACATCTGC TTCGTCAGTG GCATGAGGAT 1380 

CAAGACCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA ACATGATCTG GTCAGCCCCC CTGCAAGTCA TCCTTGCTCT 1560 

CTACCTCCTG TGGCTGAATC TGGGCCCTTC CGTCCTGGCT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TGGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CCTTCGTGTC TTTGGCCTTG TTCAACATCC TCCGGTTTCC 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTGTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGCCAG 2160 

GAGCGACCCT CCCACACTGA ATGGCATCAC CTTCTCCATC CCCGAAGGTG CTTTGGTGGC 2220 

CGTGGTGGGC CAGGTGGGCT GCGGAAAGTC QTCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280 

GGACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTCCGTG GCCTATGTGC CACAGCAGGC 2340 

CTGGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT OGATOTCAGC TGGAQQAACC 2400 

ATATTACAGG TCCGTGATAC AGGCCTGTGC CCTCCTCCCA GACCTGGAAA TCCTGCCCAG 2460 

TGGGGATCGG ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC AGAAGCAGCG 2520 

CGTGAGCCTO GCCCGGGCCG TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAQTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATGAGCTACT TGCCGCAGGT 2700 

GGACGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCOTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGGGAAG CAACTGCAGA GACAGCTCAG 2940 

CAGCTCCTCC TCCTATAGTG GGGACATCAG CAOGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAGAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240 

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480 

GGGCTCCCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA CGCOCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCCCGG CAGCTGAAGC GCCTCGAGTC GOTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720 

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020 

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGCAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGTGT GAGCCOCAGA 4800 

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 

SEQ ID NO:22 OBH2 Protein sequence: 
Protein Accession #: AAB46618 

1 11 21 31 41 51 

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL DWIVCWADLF YSFWSRSRGI FLAPVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIKLT FWLVALVCAL AILRSKIMTA LKEDAQVDLF RDITFYVYFS 180 

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPE SSASFLSRIT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQWFVLVKN WKKECAKTRK QPVKWYSSK DPAQPKESSK VDANEEVEAL 300 

IVKSPQKEWN PSLFKVLYKT FGPYPLMSPF FKAIHDLMMF SGPQILKLLI KFVNDTKAPD 360 

WQGYFYTVLL FVTACLQTLV LHQYFHTCFV SGMRIKTAVI GAVYRKALVI TNSARKSSTV 420 

GEIVHLHSVD AQRFMDLATY INMIWSAPLQ VILALYLLWL NLGPSVLAGV AVHVLMVPVN 480 

AVMAMKTKTY QVAHMKSKDN RIKLMNEILN GIKVLKLYAW BLAFKDKVLA IRQEELKVLK 540 

KSAYLSAVGT FTWVCTPFLV ALCTFAVYVT IDENNILDAQ TAFVSLALFN ILRFPLNILP 600 

MVISSIVQAS VSLKRLRIFL SHEELEPDSI ERRPVKDGGG TNSITVRNAT FTWARSDPPT 660 

LNGITFSIPE GALVAWGQV GCGKSSLLSA LLAEMDKVEG HVAIKGSVAY VPQQAWIQND 720 

SLRENILFGC QLEEPYYRSV IQACALLPDL EILPSGDRTE IGEKGVNLSG GQKQRVSLAR 780 

AVYSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSMS YLPQVDVIIV 840 

HSGGKISEMG SYQELLAKDG AFAEFLRTYA STEQEQDAEE NGVTGVSGPG KEAKQMENGM 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEETWKLMEA DKAQTGQVKL 960 

SVYWDYMKAI GLFISFLSIF LFMCNHVSAL ASNYWLSLWT DDPIVNGTQE HTKVRLSVYG 1020 

ALGISQGIAV FGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG NLVNRFSKEL 1080 

DTVDSMIPEV IKMFMGSLFN VIGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVIRAFEE QERFIHQSDL KVDENQKAYY PSIVANRWLA 1200 

VRLECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTYLNWLVRM SSEMETNIVA 1260 

VERLKEYSET EKEAPWQIQE TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHXHVTZKGG 1320 

EKVGIVGRTG AGKSSLTLGL FRINESAEGE IIIDGINIAK IGLHDLRFKI TIIPQDPVLF 1380 

SGSLRMNLDP FSQYSDEEVW TSLELAHLKD FVSALPDKLD HECAEGGENL SVGQRQLVCL 1440 

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQPEDCT VLTIAHRLNT IMDYTRVXVL 1500 

DKGEIQEYGA PSDLLQQRGL FYSMAKDAGL V 

SEQ ID NO:23 PAA2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013309 

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons) 






1 11 21 31 41 51 

I I I I I I 

ATGGCCGGCT CTGGCGCGTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TGATGCGCCG 60 

CTGTTTTTAA ATGACACCAG CGCCTTTGAC TTCTCGGATG AGGCGGGGGA CGAGGGGCTT 120 

TCTCGGTTCA ACAAACTTCG AGTTGTGGTG GCCGATGACG GTTCCGAAGC CCCGGAAAGG 180 

CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240 

TTACCTTTGA CCAACAGTCA GCTGAGTTTG AAGGTGGACT CCTGTGACAA CTGCAGCAAA 300 

CAGAGAGAGA TACTGAAGCA GAGAAAGGTG AAAGCCAGGT TGACCATTGC TGCCGTTCTG 360 

TACTTGCTTT TCATGATTGG AGAACTTGTA GGTGGATACA TTGCAAATAG CCTAGCAATC 420 

ATGACAGATG CACTTCATAT GTTAACTGAC CTAAGCGCCA TCATACTCAC CCTGCTTGCT 480 

TTGTGGCTAT CATCAAAATC ACCAACCAAA AGATTCACCT TTGGATTTCA TCGCTTAGAG 540 

GTTTTGTCAG CTATGATTAG TGTGCTGTTG GTGTATATAC TTATGGGATT CCTCTTATAT 600 

GAAGCTGTGC AAAGAACTAT CCATATGAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660 

ACCGCAGCTG TTGGAGTTGC AGTTAATGTA ATAATGGGGT TTCTGTTGAA GCAGTCTGGT 720 

CACCGTCACT CCCATTCCCA CTCCCTGCCT TCAAATTCCC CTACCAGAGG TTCTGGGTGT 780 

GAACGTAACC ATGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGCTTTGGGA 840 

GATTTGGTAC AGAGTGTTGG TGTGCTAATA GCTGCATACA TCATACGATT CAAGCCAGAA 900 

TACAAGATTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960 

TTTCGAATCA TATGGGATAC AGTAGTTATA ATACTAGAAG GTGTGCCAAG CCATTTGAAT 1020 

GTAGACTATA TCAAAGAAGC CTTGATGAAA ATAGAAGATG TATATTCAGT CGAAGATTTA 1080 

AATATCTGGT CTCTCACTTC AGGAAAATCT ACTGCCATAG TTCACATACA GCTAATTCCT 1140 

GGAAGTTCAT CTAAATGGGA GGAAGTACAG TCCAAAGCAA ACCATTTATT ATTGAACACA 1200 

TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAGTTACA GGCAAGAAGT GGACAGAACT 1260 
TGTGCAAATT GTCAGAGTTC TAGTCCCTGA 



SEQ ID NO:24 PAA2 Protein sequence: 
Protein Accession #: NP_037441 

1 11 21 31 41 51 

I I I I I I 

MAGSGAWKRL KSHLRKDDAP LFLNDTSAFD FSDEAGDEGL SRFNKLRVW ADDGSEAPER 60 

FVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120 

YLLFMIGELV GGYIANSLAI MTDALHMLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180 

VLSAMISVLL VYILHGFLLY EAVQRTIHMN YEINGDIMLI TAAVGVAVNV IMGFLLNQSG 240 

HRHSHSHSLP SNSPTRGSGC ERNKGQDSLA VRAAFVHALG DLVQSVGVLI AAYIIRFKPE 300 

YXXADPICTY VFSLLVAFTT FRIIWDTWI ILEGVPSHLN VDYIKEALMK IEDVYSVEDL 360 

NIWSLTSGKS TAXVHIQLIF GSSSKWEEVQ SXANHLLLNT FGKYRCTIQL QSYRQEVDRT 420 
CANCQSSSP 



SEQ ID NO:25 PAA3 DNA SEQUENCE 

Nucleic Acid Accession #: AB037765 

Coding sequence: 375-2798 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GCCGAGTCGG TGGCGGCTGC AGGCTGGGAG GGAGAAGTGC TACGCCTTTG CAGGTTGGCG 60 

AAGTGGTTCC AGGCTACCCG GCTAGTCTGG CACGGCCCCG TCTTCTGCCT CCTCCTCCGT 120 

CGCGTGGCGG CGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT CCCCGCCCGC 180 

AGGTCCCGGG CAGATAACAT AGATCATCAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240 

ATTTGAAAGT AGCAAAATAG AAAATAAAGA ATTAACAGCA GATACAGAGG ACAGCATGGA 300 

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360 

AACTGCAGCT QATAATGTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420 

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTAOC AGAACTGAGT CCTCAGAAAT 480 

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540 

ACTATGGAAT TTCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA AGATACTGTG 600 

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660 

TCCCTACTGA CACCTTGTTT GATGTGAATG CCATTGTCGC CCATGTTCTC TTTGCTCTTC 720 

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780 

TGAAAGGAAA AGCAAATATT ATATTCTCAT ATGTAAGAGC CATTGGAATA CCAGAGCACA 840 

GAGCAGTCAT GGAAGCCGGT TTTGTGTATG GGACTACATA CCAATTTGTC TTAACCACAG 900 

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATATGCACAT CTCTACTTTT 960 

TTCATTGTAA ACTAGTCTTG GACTTGACCC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020 

CATTGACTAC ACTGAACATT CACCTGTTTA TTAAGACAAT GAAAGCACCT CTGTTGACTG 1080 

AAGTTGCTGA AGATCCTCAA CAAGTTTCAA CTGTCCATCT CCAACTGGGC TTACCACTGG 1140 

TTTTTATTGT TAGCCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTG 1200 

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AAGGGACTCT TTGGAAGTGA 1260 

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGTT CCAGTGGAAT 1320 

TTTTGGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAAT ATGCACATTG 1380 

AGOAAATACA AGAAGATGAA GACAATGACA TGQAAGGTCC AGATATAGAT GTTCAGGATG 1440 

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500 

TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAGTACTCT 1560 

TCTATGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATT60TCT0 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740 

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTOCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860 

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1960 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTCCTCTT CTTGTTTTGG 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAQ AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGATAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATATAG AAATTATTAA TGAGATATTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGCGTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAQ CTAAAACTCA GTAAGTGCAT 4200 

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATTAGC CTAATTATTA 4380 

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA3 Protein sequence: 

1 11 21 31 41 51 

| | | | | | 

KFSGFNVFRV OISPVIMCIF YMPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60 

VAKVNCVKEE ISRYCGKEKD LHKAYLFKGN ILLREFPTDT LFDVNAIVAH VLFALLFSEV 120 

KYITNLEDLQ NIENALKGKA NIIFSYVRAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180 

ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240 

PQQVSTVHLQ LGLPLVFIVS QQATYBADRR TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300 

ANWFKRAEB GVPVEFLVLH DVDLIISHVE NNMHIEEIQE DEDNDMEGPD IDVQDDEVAE 360 

TVFRDRKRKL PLELTVELTE ETFNATVMAS DSIVLFYAGW QAVSMAFLQS YIDVAVKLKG 420 

TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKFIQLNRIS 480 

YPVNITSIQE AEEYLSGELY KDLZLYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVITG 540 

IYSEEDVLLL STKYAASLPA LLLARHTEGK IESIPLASTH AQDIVQIITD ALLEMFPEIT 600 

VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENLVL WLKKLEAGLE NHITILPAQE 720 

WKPPLPAYDF LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KEPIETLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN 

SEQ ID NO:27 PAA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012449 

Coding sequence: 66-1085 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

I I I I I I 

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAO 60 

AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120 

GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180 

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780 

CATCTGTGAG TGACTCTTTG ACATGGAGAG AATTTCACTA TATTCAGAGC AAGCTAGGAA 840 

TTGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAQAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 

SEQ ID NO:28 PAA5 Protein sequence: 
Protein Accession #: NP_036581 

1 11 21 31 41 51 

| | | | | | 

MESRKDITNQ EELWKKKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60 

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVTHPLAT SHQQYFYKIP ILVTNKVLPM 120 

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFP FAVLHAXYSL 180 

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240 

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300 
VLIFKSILFL PCLRKKILKI RHGWEBVTKI NKTEICSQL 

SEQ ID NO:29 PAA7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_030774 

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300 

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGCGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CTGATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGGTATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660 

ATACGAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTATTGGCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT TTTTTTTTTT TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGIT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAQTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTTCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTTTC CAGATTCAGG GAGAATGTTG 2400 

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520 

TAAAATTTTA TTTTAAATTT T 

SEQ ID NO:30 PAA7 PROTEIN SEQUENCE 

Protein Accession #: NP_110401 

1 11 21 31 41 51 

| | | | | | 

MSSCNFTHAT PVLIGIPGLE KAHFWVGFPL LSMWVAMFG NCIWFIVRT ERSLHAPMYL 60 

FLCMLAAIDL ALSTSTMPKI LALFWFDSRE ISPEACLTQM FFIHALSAIE STILLAMAFD 120 

RYVAICHPLR HAAVLNNTVT AQIGIVAWR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH 180 

QDVMKLAYAD TLPNWYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC 240 

VSHIGWLAP YVPLIGLSW HRFGNSLHPI VRWMGDIYL LLPPVINPII YGAKTKQIRT 300 

RVLAHFKISC DKDLQAVGGK 

SEQ ID NO:31 PAV6 DNA SEQUENCE 

Nucleic Acid Accession #: XM_050837 

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons) 






1 11 21 31 41 51 

I I I I I I 

ATGAACTGGG AGCTGCTGCT GTGGCTGCTG GTGCTGTGCG CGCTGCTCCT GCTCTTGGTG 60 

CAGCTGCTGC GCTTCCTGAG GGCTGACGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 300 

AAAGAAAAAG ATATACTTGT TTTGCCOCTT GACCTGACCG ACACTGGTTC CCATGAAGCG 360 

GCTACCAAAG CTGTTCTCCA GGAGTTTGGT AGAATCOACA TTCTGGTCAA CAATGGTGGA 420 

ATGTCCCAGC GTTCTCTGTG CATGGATACC AGCTTGGATG TCTACAGAAA GCTAATAGAG 480 

CTTAACTACT TAGGGACGGT GTCC0TGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 540 

AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTTTCC 600 

ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 660 

CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATOAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 900 

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 960 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA 

SEQ ID NO:32 PAV6 PROTEIN SEQUENCE
Protein Accession #: XP_050837



1 11 21 31 41 51 

| | | | | |

MNWEXLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTGASS 60 

GXGEELAYQL SKLGVSLVLS ARRVHELEHV KRRCLENGNL KEKDILVLPL DLTDTGSHKA 120

ATKAVLQEFG RIDILVNNGG MSQRSLCHDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIBR 180

KQGKXVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTB LATYPGIIVS NICPGPVQSN 240 

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP PLLVTYLWQY 300 
MPTWAWW1TN KMGKKRJENF KSGVDADSSY FKIFKTKHD 

SEQ ID NO:33 PBA6 DNA SEQUENCE

Nucleic Acid Accession #: NM.006S53

Coding sequence: 28-874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 240

CGAGAAGACG CGGCTACTCT GTGOGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAOC 300

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACOGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780 

CCAGGATCCO TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC 900

ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE

Protein Accession #: NP.006M4

1 11 21 31 41 51

I I I I I I 

MRXLQLXLLA LATGLVGGET RIIKGFBCKP HSQPWQAALF BKTRLLCGAT LIAPRWLLTA 60

AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT BSFPHPGFNN SLFNKDHRND IHLVKMASPV 120

SITWAVRPLT LSSRCVTAGT SCLISGWGST SSFQLRLPHT LRCANITIIE HQKCENAYPG 180 

NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWO QDPCAITRKP GVYTKVCKYV 240 

DWIQETMKNN 

SEQ ID NO:35 PBC1 DNA SEQUENCE

Nucleic Acid Accession #: NMJXH773

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51

| | | | | |

CTAAAGCTCT CTTGCTGCCT AGCCTCCTGC CGGCCTCATC TTCGCCCAGC CAACCCCGCC 60 

TGGAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120

CTCTCTAGGA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTCCTGAT CCTCGTCGTG 180

GTGCTCGCGG TGGTCGTCCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCACCAAG 240

CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300

AGACATGTAG ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420 

CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TGGCCCATCA GTTCACACAG 480 

GTCCAGCGGG ACATGTTCAC CCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540

ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGGTTTCCCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAQAGAA GGTTCAGACA 780 

CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGGATCCC 840

ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900

ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTGAAAAATC CTGAGGATTC ATCTTGCACA 960 

TCTGAGATCT GAGCCAQTCG CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140

CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200
AATGAAAATT GTATGTTAAG TTACTTCCTT TAG 






SEQ ID NO:36 PBC1 PROTEIN SEQUENCE
Protein Accession #: NP_001766



1 11 21 31 41 51

| | | | | |

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILWVLA VWPRWRQTW SGPGTTKRFP 60 

ETVLARCVKY TEIHPEKRHV DCQSWJDAFK GAFISKHPCN ITEEDYQPLM KLGTQTVPCN 120

KILLWSRIKD LAHQFTQVQR DMFTLEDTLL GYLADDLTWC GEPNTSKINY QSCPDWRKDC 180

SNNPVSVFWK TVSRRFAEAA CJJWHVMLNG SRSKJFDKNS TFGSVEVHNL QPEKVQTLEA 240 
WIHGGREDS RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI 

SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

ATGTOCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120 

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTCTTCTT TACCAAAGAT 180 

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGAKA 420 

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600 

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGOTC 660

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560

CTCAOGTTTQ TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA OGAAGACAGA 1620

AATGGCCGGG ACGAGATGGA CATAGAACTC CACGACGTGT CTCCTATTAC TCGGCACCCC 1680 

CTGCAAGCTC TCTTCATCTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GCCCTGGGAG CCAGCAAGCT TCTGAAGACT 1800

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGOAGCT GGCTAATGAG 1860

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTGTTT 2100

ATTATACCCT TGGTGGGCTG TGGCTTT6TA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGGT CTTCTCCTGG 2220 

AATGTGGTCT TCTACATCGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTOGGTGC CACACCCCCC CGAGCTGGTC CTGTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400

ATGGACACGC TGGGGCTTTT TTACTTCATA GCAGGAATTG TATTTCGGCT CCACTCTTCT 2460

AATAAAAGCT CTTTGTATTC TGGACGAGTC ATTTTCTGTC TGGACTACAT TATTTTCACT 2520 

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAG GACCCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTG GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700

CGTTCGGTCA TCTACGAGCC CTACCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCCCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTG 2880 

TGCATCTACA TGTTATCCAC CAACATCCTG CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940 

TACACGGTGG GCACCGTCCA GGAGAACAAT GACCA6GTCT GGAAGTTCCA GAGGTACTTC 3000

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180 

GAAAACTACC TTGTCAAGAT CAACACAAAA OCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300
AATAAAATCA AATGA

SEQ ID NO:38 PBH1 Protein sequence 
Protein Accession #: XP_017718

1 11 21 31 41 51

I I I I I I 

MSFRAARLSM RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANFK KRECVFFTKD 60 

SKATENVCKC GVAQSQHMEG TQINQSEKWN YKKHTKEFPT DAFGDIQFET LGKKGKYIRL 120 

SCDTDAEILY ELLTQHWHLK TPNLVISVTG GAKNFALKPR MRK1PSRLIY IAQSKGAWIL 180

TGGTHYGLMK YIGEWRDNT ISRSSBENIV AIGIAAWGMV SNRDTLIHNC DABGYFLAQY 240

LMDDFTRDPL YILDNNHTHL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300 

IVCFAQGGGK ETLKAINTSI KNKIPCVWB GSGQIADVIA SLVEVEDALT SSAVKBKLVR 360 

FLPRTVSRLP EEETESWIKW LKEILECSHL LTVIKMEEAG DEIVSNAISY ALYKAFSTSE 420 

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RRWESADLQE VMFTALIKDR PKFVRLFLEN 480 

GLNLRKFLTH DVLTELFSNH FSTLVYENLQ IAKNSYNDAL LTFVWKLVAN PRRGFRKEDR 540

NGRDEMDIEL HDVSPITRHP LQALFIWAIL QNKKELSKVI WECTRGCTLA ALGASKLLKT 600 

LAKVKNDINA AGESEELANE YETRAVELFT ECYSSDEDLA EQLLVYSCEA WGGSNCLELA 660 

VEATOQHFIA QPGVQNFLSK QWYGEISRDT KNWKIILCLF IIPLVGCGFV SFRKKPVDKB 720 

KKLLWYYVAP FTSPFWFSW NWFYIAFLL LFAYVLLMDF HSVFHPPELV LYSLVFVLFC 780

DEVRQWYVNG VNYFTDLWNV HDTLGLFYFI AGIVFRLHSS NKSSLYSGRV IFCLDYIIFT 840

liRLIHIFTVS RNIX3PKIIML QRHLIDVFFF LFLFAVWMVA FGVARQGILR QNEQRWRWIF 900 

RSVIYEPYLA MFGQVPSDVD GTTYDFAHCT FTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CIYMLSTNIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQEYCSRLN IPFPFIVFAY 1020 

FYMWKKCFK CCCKEKNMES SVCCFKNEDN ETLAMEGVMK ENYLVKINTK ANDTSEEMRH 1080 
RFRQLDTKLN DLKGLLKEIA NKIK 
SEQ ID NO:39 PBH3 DNA SEQUENCE

Nucleic Acid Accession #: XM_011804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | |

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60

AGAGCAGTCG CGGCCAAATO GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120 

CGCGCGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACOGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540 
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: NPJW8842

1 11 21 31 41 51 

| | | | | |

HPRLFLFHLL EFCLLLNQFS RAVAAKWKDD VIKLCGRELV RAQIAZCGMS TWSKRSLSQE 60 
DAPQTPRPVA EIVPSFINKD TETIIIMLEF IANLPPELKA ALSERQPSLP ELQQYVPALK 120

DSNLSFEEFK KLIRNRQSEA ADSNPSELKY LGLDTHSQKK RRPYVALFEK CCLIGCTKRS 180 
LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

I I I I I I 

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCAG 480 

TGTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCCTGTGGGC AGGACCACTG 660 

CAGGCGATCG CAGTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTGCTGGG 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GTATGGGGCT 1080 

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140 

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGG ATAATGAGGA AAGTGAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAACCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACGTTT 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

ATTCCTTGGA TCGCAATAOC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820 

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880 

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC GCACCACCAG CCTGGCCCCA TGAAGGAGTQ 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTGTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA 
SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51 

I I I I I I 

MLPVYQEVKP NPLQDANLCS RVFFWWLNPL FKIGHKRRLE EDDMYSVLPE DRSQHLGEEL 60 

QGFWDKEVLR AENDAQKPSL TRAIIKCYWK SYLVLGIFTL IEESAKVIQP IFLGKIINYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGMRLRVAK CHMIYRKALR 180 

LSNMAMGKTT TGQIVNLLSN DVNKFDQVTV PLHFLWAGPL QAIAVTALLW MEIGISCLAG 240 

MAVLIILLPL QSCFGKLPSS LRSKTATFTD ARIRTMNEVI TGIRIIKMYA WEKSFSNLIT 300 

NLRKKEISKI LRSSCLRGMN LASFFSASKI IVFVTFTTYV LLGSVTTASR VFVAVTLYGA 360 

VRLTVTLPFP SAIERVSBAI VSIRRIQTFL LLDEISQRNR QLPSDGKKMV HVQDFTAFWD 420 

KASBTPTLQG LSFTVRPGEL LAWGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480 

QPWPSGTLR SNILFGKKYE KERYEKVIKA CALKKDLQLL EDGDLTVIGD RGTTLSGGQK 540 

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600 

SQILILKDGK MVQKGTYTEF LKSGIDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSESSVW 660 

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSEGKVGFQA YKNYFRAGAH WIVFIFLILL 720 

NTAAQVAYVL QDWWLSYWAN KQSMLNVTVN GGGNVTEKLD LNWYLGIYSG LTVATVLPGI 780 

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKDIG HLDDLLPLTF 840 

LDFIQTLLQV VGWSVAVAV IPWIAIPLVP LGIIPIFLRR YFLETSRDVK RLESTTRSPV 900 

FSHLSSSLQG LWTIRAYKAE ERCQELFDAH QDLKSEAWFL FLTTSRWFAV RLDAICAMFV 960 

IXVAPGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVTBYTDLE 1020 

KEAPWEYQKR PPPAWPHEGV IIFDNVNFMY SPGGPLVLKH LTALIKSQEK VGIVGRTGAG 1080 

KSSLISALFR LSEFEGKXWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT MRKNLDPFNE 1140 

HTDEEXWNAL QEVQLKETIE DLPGKMDTBL AESGSNPSVG QRQLVCLARA ILRKNQILII 1200 

DEATANVDPR TDELIQKKIR EKFAHCTVI/r IAHRLNTIID SDKIMVLDSO RLKEYDEPYV 1260 

LLQNKESLFY KHVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTOT SKGQPSTLTI 1320 
FETAL 

SEQ ID NO:43 PBQ7 DNA SEQUENCE

Nucleic Acid Accession #: NMJK1233

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300 

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC ACCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAATATAAAA 900 

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT 

SEQ ID NO:44 PBQ7 Protein sequence
Protein Accession #: NP_067056

1 11 21 31 41 51 

| | | | | |

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGETGLEYL 60 

YLDSTTRSWR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKPVNYSRKY 120 

GHTKGLLLWN RVQGFWLIHS IPQFFPIPEE GYDYPPTGRR NGQSGICITF KYNQYEAIDS 180 

QLLVCNPNVY SCSIPATFHQ ELIHHPQLCT RASSSEIPGR LLTTLQSAQG QKFLKFAKSD 240 

SFLDDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNIKAIKLS RHSYFSSYQD 300 
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAPQGLV LYYESCK 

SEQ ID NO:45 PCQ8 DNA SEQUENCE

Nucleic Acid Accession #: XM_030453

Coding sequence: 69-1273 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

I I I I I I 

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120 

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTGACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTOAA AATATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360 
CATCACCTAC CAGACCAAGC TGGCCAAGGA CGTGCTGGAC ACCATCCTAG GCATCCAACC 420
CAAGGACACC TCTGGTGGAG GGGATGAGAC CCGG6AGGCG GTGGTGGCCC GGCTGGCTGA 480 
TGATATGCTG QAQAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAG AGAGGCTGCA 540 
GAAGATGGGG CCATTCCAGC CTATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 600 
AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC 660 
CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720 
TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 780 
TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 840 
GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900 
GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT 960

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT 6TCTATGGCT TATATCTTGA 1020 

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 1080 

TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCTCGGTT 1140 

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 1200 

GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTGCTC CGTGGGGTTG CCCTTCTGTG 1260 

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 1320 

AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGQ TCGTTTATGC 1380 

AATTAATGAG CTGCATAGGT TTTCCCCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 1440 

TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500

ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC 1560

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 1620 

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 1680 

AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG 1740 

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800 

CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA 1860 

TAGTCAGTAC TAAATTAGAA TTGTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA 1920 

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC 1980 

CCTCTCACTG GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040 

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220 
TTTACTAAAA AAAAAAAAAA AAA 

SEQ ID NO:46 PCQ8 Protein sequence
Protein Accession #: BAB15543

1 11 21 31 41 51

I I I I I I 

MDVKKGVSWT TIRYHIGEIQ YGGRVTDDYD KRLLNTFAKV WPSENMFGPD FSFYQGYNIP 60 
KCSTVDNYLQ YIQSLPAYDS PBVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE 120
TREAWARLA DDMLEKLPPD YVPFEVKERL QKKGPFQPMN IFLRQEIDRM QRVLSLVRST 180 
LTELKUUDG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS 240 
WVFNGRPHCF WMTGFFKPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW MKDDISTPPT 300 
EGVYVYGLYL BGAGWDKRNM KLIESKPKVL FELMPVIRIY AENNTLRDPR FYSCPIYKKP 360
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK

SEQ ID NO:47 PDG5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60 

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 
ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600 

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900 

TTCAAATAAT ACTCCTGAAG AGCAGAATQA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTGTTCA GCAACAAGTC CCCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140 

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 

TAAGGAGCAG CTGCTTCCCA GACATCTTTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA OTCCATTGCC 1680

TCCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACCG ATTCCCAGGC GTCCGACCCA 1800 

GTCATTCGTG AAATTTATGG CACAGCAAAT CTTTTCAGAG AGCTCTGCTC TTAAGAGGGG 1860 

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920 

CAAGCACCAA GTTTTCTCAG ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980 

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040 

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCTGTCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAG CAAAAAGTCT CCCCTGTTTC 2160 

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAG CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA GATAGGTCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAGTGAATGT 2280 

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGCGATCT GTTTTTGAGA GCAATTCTGA 2400 

CAATTGGTTC CTAGGAAGAG ATGAAGCTTT TGCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTCCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580 

GAGTGGTGA? GOTAATAATA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCCC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGCCCA ATTTCATCCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCT6 CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCATAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AAGTTCCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240 

TCTCCCAGCC AAGTTCCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300 

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTGGAGCATC AGCATTTATT TTATTTAGTT TTTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420 

CTCGCTCTGT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTCCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG CCCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660 

CTGGGATTAC AGGCGTGAGC CACCGCGCCC GGCCAAGCAT CAGCGTTTTA AATGATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780 

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT CCCTCAACCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGCCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CTTTCTCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACCCCACGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTTCTTGTC 4500 

CCAATACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTOC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TOTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTTACTAGGC TGGAGAGACC CTACCTTCCA GTQACCCACT 4920 

CATCCCCCAG CCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980 

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040 

CCATCTTGCC ACACGGTCTT TTTCTTTTGT AGCACAGCCT CCATTAATAA CTCCTCGGCT 5100 

GAGGATGAAG ATGTAGGCAC CTTTACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA CCACCCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTOCCT 5280 

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580 

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT OAATTTTATT 5640 

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700 

TGGTAGATTA GCTGTTACAC CTCCCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 ppQS Protein sequence
Protein Accession #: BAA86524 

1 11 21 31 41 51

I I I I I I

EQPTTSQPBT TTPQGLLSDK DDMGRRKAGI DFGSRKASAA QPIPENMDNS HVSDPQPYHE 60

DAASGAEKTE ARASLSLMVE SLSTTQEEAI LSVAAEAQVF MNPSHIQLED QEAFSFDLQK 120

AQSKMBSAQD VQTICKEKPS GNVHQTFTAS VLGMTSTTAK GDVYAKTLPP RSLFQSSRKP 180 

DAEEVSSDSE NIPBEGD6SB ELAHOHSSQS LGKFEDEQEV FSB5KSFVED LSSSEEELDL 240 

RCLSQALEEP EDAEVFTESS SYVEKYNTSD DCSSSEEDLP LRHPAQALGK PKNQQEVSSA 300 

SNNTPEEQND FMQQLPSRCP SQPIMNPTVQ QQVPTSSVGT SIKQSDSVEP IPPRHPPQPW 360 

VNPKVEQEVS SSPKSMAVEE SISMKPLPPK LLCQPLMNPK VQQNMFSGSE DIAVERVISV 420 

EPLLPRYSPQ SLTDPQIRQI SESTAVEEQT YVEPLPPRCL SQPSERPKFZ, DSMSTSAEWS 480 

SPVAPTPSKY TSPPWVTPKF EELYQLSAHP ESTTVEEDIS KEQLfcPRHLS QLTVGNKVQQ 540 

LSSNFERAAI EADISGSPLP PQYATQFLKR SKVQEMTSRL EKMAVEGTSN KSPIPRRPTQ 600 

SPVKPMAQQI FSESSALKRG SDVAPLPPNL PSKSLSKPEV KHQVFSDSGS ANPKGGISSK 660 

MLPMKHPLQS LGRPEDPQKV FSYSEHAPGK CSSFKEQLSP RQLSQALRKP EYEQKVSPVS 720 

ASSPKEWRNS KKQLPPKKSS QASBRSKFQP QMSSKGPVNV PVKQSSGEKH LPSSSPFQQQ 780 

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AIKTKKFSQG SKNPIKSIPA PATKPGKFTI 840 

APVRQTSTSO GIYSKKEDLE SGDGNNNQHA NLSNQDDVEK LFGVRLKRAP PSQKYKSEKQ 900 

DNFTQLASVP SGPISSSVGR GHKIRSTSQG LLDAAGNLTK ISYVADKQQS RPKSE5HAKK 960 

QPACKTPGKP AGQQSDYAVS EPVWITMAKQ KQKSFKAHIS VKELKTKSNA GADAETKEPK 1020 

YEGAGSANEN QPKKMFTSSV HKQEKTAQMK PPKPTKSVGF BAQKILQVPA MEKETKRSST 1080 
LPAKFQNFVE PIEPVWFSLA RKKAKAWSHM AEITQ 

SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GCTTTCCTTT CTAAAGTAGA AGAGGATGAT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG GTCTAAAGAA AAAAACCCTG GGAATCAGGG CAGGCAGTTT 120 

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180 

ATTGAAGAAA GCAAGCAAGA AACTAGTATG ATTTTGGATA GTGAAAAAAC AAGTGAGACT 240

GCTGCCAAAG GGGTCAACAC AGGAGGCAGG GAACCAAATA CAATGGTGGA AAAAGAACGC 300 

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360 

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420 

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480 

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG CCAGGGGTCT 540 

GCTGCTGCAG AACCTGAAGA TGACTCGTTC CACTGGACTC CACATACAAG TGTAGAGCCA 600 

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660 

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720 

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGCCTGC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TAGAGATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900 

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960 

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCACCTCTAG AGGAAGGCTT GGGTGGAGCA 1020 

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080 

AATGTGCAGG TTCCTGAAGA ACCCACCCAC TTGGACCAAC GTGTGATTGG GGACACTCAT 1140 

GCCTCAGAAG TGTCACAGAA GCCAAATACT GAGAAAGACC TGGACCCAGG GCCAGTTACA 1200 

ACAGAAGACA CTCCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGCCGCCGAA 1260 

GAGCCGGCAA GTGTCACACC TTTGGAAAAC GCAATCCTTC TAATATATTC ATTCATGTTT 1320

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380

TATGGACTGC CATGGAAACC TGTATTTATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440 

ATTTTCTTAT GGAGAACTGT CCTTGTTGTG AAGGATAGAG TATATCAAGT CACGGAACAG 1500 

CAAATTTCTG AGAAGTTGAA GACTATCATG AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AATCTTCGTG TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800 

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920 

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAG CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAGTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100 

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG AAGTGGGAGG TGACCGGAAT GAGAAGATGA AAAATCAAAT TAAGCAGATG 2220 

ATGGATGTCT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCCTC CGTGTCCACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340 

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400 

ACCTTGAGGC AGAAAGTGGA GATTCTGAAT GAGCTCTATC AGCAGAAGGA GATGGCTTTG 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA CGGCAAGAAA GAGAGCACAG GCTGTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGCG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA GAAGACAGAG CGGTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTG CAGAAAGAGC TATAGCTGAA 2700 

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760 

ATGCTGCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820 

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TGTGAGTGGT 2880 

GGAGAATGCT CCCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGCCTAG AAGTGAATTT GGATCAGTGG ACGGGCCTCT ACCTCATCCT 3000 

CGATGGTCAG CTGAGGCATC TGGGAAACCC TCTCCTTCTG ATCCAGGATC TGGTACAGCT 3060 

ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120 

GTTAATATGG CTCCAAAAGG GCCOCCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCCCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240

TTTGGGCCTC GGCCACTTCC TCCACCCTTT GGCCCTGGTA TGCGTCCACC ACTAGGCTTA 3300

AGAGAATTTG CACCAGGCGT TCCACCAGGA AGACGGGACC TGCCTCTCCA CCCTCGGGGA 3360

TTTTTACCTG GACACGCACC ATTTAGACCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420 

ATTCCTGGTA CCCGATTACC ACCCCCAACC CATGGTCCCC AGGAATACCC ACCACCACCT 3480 

GCTGTAAGAG ACTTACTGCC OTCAGGCTCT AGAGATGAGC CTCCACCTGC CTCTCAGAGC 3540

ACTAGCCAGG ACTGTTCACA GGCTTTAAAA CAGAGCCCAT AAAACTATGA CCTCTGAGGT 3600

TTCATTGQAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660 

AAAATCCAAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720 

CATTTTTGAG CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780 

AGCTAGAGCG TCCTTACAAC TTTGAAATGT GCAATAAAGA ATACCTGTGT TTTAGCTAAT 3840 

GTAGCATATG TAATTGCAAA ATGATTTAGA ATGTCATGAA AAATATGAAC ATTTCCTGTG 3900 

GAAATGCTTT AAGAACATGT ATTTCCATTA TCCTATTTTT AGTGTACACC AGCTGAATAC 3960 

GGAGCAATGG TGTTTATAAG CGTTTTTTTA AACTATCTGG TCACAAAGAC TGTTACGCTA 4020 

AAAATGTTTA CTAAAAGATC ACTAAACTAT CTCCCCTCTT GCTGAAGTTC TTTGTAGTAA 4080 

TAGCTCATAA AAATTTGTTT ATTAATATTT CCCAAGTGTC TGTTGACTCA TTGGACTGTT 4140 

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCAGGCT OCCAGAACTG AAGATGGTGG 4200 

CTGGTGGCAC ACTTCCGGCT GCTCCTCCGT CACCTGTGAA CTCTACAAGT GATGTCTTTT 4260 

TATTTCAAAQ AAGTTTATTT CCCACTTGTA TAGCATTCAC ATGCTTTCTT TACGATOCTC 4320 

ATTGTCTATT TGAGAATGGT TTTCTGAGAG TGAGTTTACA TTAGTAGCAA GAGTTGTTTG 4380 

ACCTGATGTT CCATTGTTTT TACCATTCCT GTAGAAAAAG GGTGCACAAC AGAAAAATGA 4440 

AAATGATGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500 

CCTTATCTAT CTTTCCCATT CCTTGCCACT GATTTTTGAG GAATATAATA AAAAGATTGG 4560 

AAGAGTATAA TGCCATGAGA AAGAATGATT TAGGACTGTG AGGGTTATAA CATGCCCTAG 4620 

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680 

ATTATTCCAA AATTAATATT AATTAATATT TAAACGTTGO TGTTTTTATT TAAAAATCAG 4740 

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800 

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAGAA CGAACTTCAG ACATTTTAAT 4860 

TACAGTAATA ATAGCACTCC TTTTAAGGAG TTTCAGATCC ACACTAAAAC TAAAATCATA 4920 

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980 

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040 

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100 

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGACCTTTG AAAGAAGACA ACTCTGTCAA 5160 

AGTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA TGTCCTCCCG 5220 

TTTAATATCA AGAATAGAAG AAATTAAGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280 

TTAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340 

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATGTCATTTA 5400 

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460 

ATTCAAAATA TTAGAGTATT TTTCCCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520 

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580 

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640 

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700 

TTCCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAATTGT GTCTGGATTT 5760 

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820 

GCCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880 

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940 
TTATATTCAG GTCTGAATTA AAGTTAAGTT AATCAC 



SEQ ID NO:50 PAB7 Protein sequence 
Protein Accession #: BAA13448

1 11 21 31 41 51 

I I I I I I 

AFLSKVEEDD YPSEELLEDE NAINAKRSKE KNPGNQGRQF DVNLQVPDRA VLGTIHPDPE 60 

IEESKQETSM ILDSEKTSET AAKGVNTGGR EPNTMVEKER PLADKKAQRP FERSDFSDSI 120

KIQTPELGEV FQNKDSDYLK NDNPEEHLKT SGLAGEPEGE LSKEDHGNTE KYMGTESQGS 180

AAAEPEDDSF HWTPHTSVEP GHSDKREDLL IISSFFKEQQ SLQRFQKYFN VHELEALLQE 240

MSSKLKSAQQ ESLPYNMEKV LDKVFRASES QILSIAEKML DTRVAENRDL GMNENNIFEE 300

AAVLDDIQDL IYFVRYKHST AEETATLVMA PPLEEGLGGA MEEMQPLHED NFSREKTAEL 360 

NVQVPEEPTH LDQRVIGDTH ASEVSQKPNT EKDLDPGPVT TEDTPMDAID ANKQPETAAE 420 

EPASVTPLEN AILLIYSFMF YLTKSLVATL PDDVQPGPDF YGLPWKPVFI TAFLGIASFA 480

IFLWRTVLVV KDRVYQVTEQ QISEKLKTIM KENTELVQKL SNYEQKIKES KKHVQETRKQ 540

NMILSDEAIK YKDKIKTLEK NQEILDDTAK NLRVMLESER EQNVKNQDLI SENKRSIEKL 600 

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHRVQEENA RLKKKKEQLQ QEIEDWSKLH 660 

AELSEQIKSF EKSQKDLEVA LTHKDDNINA LTNCITQLNL LECESESEGQ NKGGNDSDEL 720

ANGEVGGDRN EKMKNQIKQM MDVSRTQTAI SVVEEDLKLL QLKLRASVST KCNLEDQVKK 780

LEDDRNSLQA AKAGLEDECK TLRQKVEILN ELYQQKEMAL QKKLSQEEYE RQEREHRLSA 840

ADEKAVSAAE EVKTYKRRIE EMEDELQKTE RSFKNQIATH EKKAHENWLK ARAAERAIAE 900

EKREAANLRH KLLELTQKMA MLQEEPVIVK PMPGKPNTQN PPRRGPLSQN GSFGPSPVSG 960 

GECSPPLTVE PPVRPLSATL NRRDMPRSEF GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020 

TMMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080

FGPRPLPPPF GPGMRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP LGSLGPREYF 1140 
IPGTRLPPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP 

SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

AGACTGAGGC GGAGGCAGCC CCGCGCCGCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 60
TTGGACTTTG AGCCATTAGA ACCATGAGCA ACTACAGTGT GTCACTGGTT GGCCCAGCTC 120
CTTGGGGTTT CCGGCTGCAG GGCGGTAAGG ATTTCAACAT GCCTCTGACA ATCTCTAGTC 180
TAAAAGATGG CGGCAAGGCA GCCCAGGCAA ATGTAAGAAT AGGCGATGTG GTTCTCAGCA 240
TTGATGGAAT AAATGCACAA GGAATGACTC ATCTTGAAGC CCAGAATAAG ATTAAGGGTT 300
GTACAGGCTC TTTGAATATG ACTCTGCAAA GAGCATCTGC TGCACCCAAG CCTGAGCCGG 360
TTCCTGTTCA AAAGGGAGAA CCTAAAGAAG TAGTTAAACC TGTGCCCATT ACATCTCCTG 420
CTGTGTCCAA AGTCACTTCC ACAAACAACA TGGCCTACAA TAAGGCACCA CGGCCTTTTG 480
GTTCTGTGTC TTCACCAAAA GTCACATCCA TCCCATCACC ATCGTCTGCC TTCACCCCAG 540
CCCATGCGAC CACCTCATCA CATGCTTCCC CTTCACCCGT GGCTGCCGTC ACTCCTCCCC 600
TGTTCGCTGC ATCTGGACTG CATGCTAATG CCAATCTTAG TGCTGACCAG TCTCCATCTG 660
CACTGAGCGC TGGTAAAACT GCAGTTAATG TCCCACGGCA GCCCACAGTC ACCAGCGTGT 720
GTTCCGAGAC TTCTCAGGAG CTAGCAGAGG GACAGAGAAG AGGATCCCAG GGTGACAGTA 780
AACAGCAAAA TGGCCCACCA AGAAAACACA TTGTGGAGCG CTATACAGAG TTTTATCATG 840
TACCCACTCA CAGTGATGCC AGCAAGAAGA GACTGATTGA GGATACTGAA GACTGGCGTC 900
CAAGAACTGG AACAACTCAG [missing] TCCGAATCCT TGCCCAGATC ACTGGGACTG 960
AACATTTGAA AGAATCTGAA GCCGATAATA CAAAGAAGGC AAATAACTCT CAGGAGCCTT 1020
CTCCGCAGTT GGCTTCCTTG GTAGCTTCCA CACGGAGCAT GCCCGAGAGC CTGGACAGCC 1080
CAACCTCTGG CAGACCAGGG GTTACCAGCC TCACAACTGC AGCTGCCTTC AAGCCTGTAG 1140
GATCCACTGG CGTCATCAAG TCACCAAGCT GGCAACGGCC AAACCAAGGA GTACCTTCCA 1200
CTGGAAGAAT CTCAAACAGC GCTACTTACT CAGGATCAGT GGCACCAGCC AACTCAGCTT 1260
TGGGACAAAC CCAGCCAAGT GACCAGGACA CTTTAGTGCA AAGAGCTGAG CACATTCCAG 1320
CAGGGAAACG AACTCCGATG TGCGCCCATT GTAACCAGGT CATCAGAGGA CCATTCTTAG 1380
TGGCACTGGG GAAATCTTGG CACCCAGAAG AATTCAACTG CGCTCACTGC AAAAATACAA 1440
TGGCCTACAT TGGATTTGTA GAGGAGAAAG GAGCCCTGTA TTGTGAGCTG TGCTATGAGA 1500
AATTCTTTGC CCCTGAATGT GGTCGATGCC AAAGGAAGAT CCTTGGAGAA GTCATCAATG 1560
CGTTGAAACA AACTTGGCAT GTTTCCTGTT [missing] AGCCTGTGGA AAGCCCATTC 1620
GGAACAATGT TTTTCACTTG GAGGATGGTG AACCCTACTG TGAGACTGAT TATTATGCCC 1680
TCTTTGGTAC TATATGCCAT GGATGTGAAT TTCCCATAGA AGCTGGTGAC ATGTTCCTGG 1740
AAGCTCTGGG CTACACCTGG CATGACACTT GCTTTGTATG CTCAGTGTGT TGTGAAAGTT 1800
TGGAAGGTCA GACCTTTTTC TCCAAGAAGG ACAAGCCCCT GTGTAAGAAA CATGCTCATT 1860
CTGTGAATTT TTGAAAGTCA ACAGTTCAGG AGAAGAGAAG GAATTTGAAG AGAAAAAGGA 1920
AAATTAAAAT TACTAATTAA TTTTTAGATT CAATATTTAT ATGGAGTTTT GAAAAATAAT 1980
AGTGGCCCTG AAGGAATAAA TTCCAGCTTT AAAAACCAAG TCTGAGGAAA TATTTGGCTT 2040
CATAAAGTAA AGAGACGOTT TGGCATTTAT TATTACTTTT TCCTGTATTT TATGCCCATA 2100
AAATAAGCTT TATAAAAACC AATTTCCTGA TGGACTATTA AATTCATCTT AGAATAAATT 2160
AGTGAAGAAT TTAATTTTAG AATAAATAAT CCAATCTGAA ATAATTATAC CTTCTTTCCT 2220
TGTTAGGTAG TTATGAGTAA ATCTGCAAAA GGCAATGAAA ATGCCTTAAA TTTTATCAAT 2280
AACAGAATTA TTGTATTTAA AAAAAAACTA ATACTTATCT TTAAAATAGT AAATAGGATT 2340
TTAAACAGAG AATTTTATCA GTAATAGGTG TCAGTTTTTA AAAAATTGCT TGTAGGCTGA 2400
GCGCGGTGGC TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGTGGG TGGACCACAT 2460
GAGGTCAGGA GTTTGAGATC AGCCTGGCCA ACATGGTGAA ACCCCATCTC TACTAAAAAT 2520
ACAAAAATTA GCCGGACGCA GTGGCACGCG CCTGTAATCC CAGCTACTCA AGAGGCTGAG 2580
GCACGAGAAT CACTTGAACC CGGGAGGGAG AGGTTGCAGT GAGCCAAGAT CGTACCACTG 2640
CACTCCAGCC TGGGTGACAG AGTGAGACTC CGTCTCCAAA AAAAAACTTT GCTTGTATAT 2700
TATTTTTGCC TTACAGTGGA TCATTCTAGT AGGAAAGGAC AATAAGATTT TTTATCAAAA 2760
TGTGTCATGC CAGTAAGAGA TGTTATATTC TTTTCTTATT TCTTCCCCAC CCAAAAATAA 2820
GCTACCATAT AGCTTATAAG TCTCAAATTT TTGCCTTTTA CTAAAATGTG ATTGTTTCTG 2880
TTCATTGTGT ATGCTTCATC ACCTATATTA GGCAAATTCC ATTTTTTCCC TTGCGCTAAG 2940
GTAAAGATTT AATTAAATAA TTTTGGCCTC TCATAGTTTT CTCTCTCTTT AAAGAGAATA 3000
AATAGAGGGC CAGGTGTGGT GGCTCACGCC TGTGATCCCA GCACTTTGGG AGGCCAAGAC 3060
GGGCGGATCA TGAGGTCAAG AGATCAAGAT CATCCTGGCC AACATGGTGA AACCCTGTCT 3120
CTACTAAAAA TACAAAAATG AGCTGGGCAT GGTGGGGCGT GCCTGTAGTC CCATGTACTT 3180
GGGAGGCTGA GGCAGGAAAA TTCTTGAACC CAGGAGACGG AAGTTGCAGT GAGCTGAGAT 3240
CACACCACTG CACTCCAGCC TGGTGACAGA GCAAGACTCC GGCTCTT

SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448

1 11 21 31 41 51

I I I I I I

MSNYSVSLVG PAPWGFRLQG GKDFNMPLTI SSLKDGGRAA QANVRIGDVV LSIDGINAQG 60
MTHLEAQNKI KGCTGSLNMT LQRASAAPKP EPVPVQKGEP KEVVKPVPIT SPAVSKVTST 120
NNMAYNKAPR PFGSVSSPKV TSIPSPSSAP TPAHATTSSH ASPSPVAAVT PPLFAASGLH 180
ANANLSADQS PSALSAGKTA VNVPRQPTVT SVCSETSQEL AEGQRRGSQG DSKQQNGPPR 240
KHIVERYTEF YHVPTHSDAS KKRLIEDTED WRPRTGTTQS RSFRILAQIT GTEHLKESEA 300
DNTKKANNSQ EPSPQLASLV ASTRSMPESL DSPTSGRPGV TSLTTAAAFK PVGSTGVIKS 360
PSWQRPNQGV PSTGRISNSA TYSGSVAPAN SALGQTQPSD QDTLVQRAEH IPAGKRTPMC 420
AHCNQVIRGP FLVALGKSWH PEEFNCAHCK NTMAYIGFVE EKGALYCELC VEKFFAPECG 480
RCQRKILGEV INALKQTWHV SCPVCVACGK PIHNNVFHLE DGEPYCETDY YALFGTICHG 540
CEFPIEAGDM FLEALGYTWH DTCFVCSVCC ESLEGQTFFS KKDKPLCKKH AHSVNF



SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 60

GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120

CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGGGAATG 180

AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCGCTGACCC TGGCAGCCAG 300

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAT GTGGGGACTT CTACAACACT 480 

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660 

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

AAGGAACTGC AGCAGCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780 

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTGC ACACCTGAGG 900 

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATATC AACAA 

SEQ ID NO:54 PBH7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I 

MANCKMTKSI RFPALEHCYT GGEVVLPKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60

KIKPGFMGKA TPPYDVQFHM EASVENCIIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120

NTEGNIGIRI KPVRPVSLFM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180

IINASGYRIG PAEVESALVE HPAVAESAVV GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIERKELR KKETGQM 

SEQ ID NO:55 PBJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF388200

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CQATGTGCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A 

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK83352

1 11 21 31 41 51 

I I I I I I 

MCCEIYYRLL VLKMEKKSEE LRNMDGLGNV EKGH 

SEQ ID NO:57 PBJ7 DNA SEQUENCE

Nucleic Acid Accession #: AA876910

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

I I I I I I 

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTTATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGGTT GGCCTGCACC TCAGGTCTCA CTCGCTGCAT TAATGGAACT 1380

GAACCAGGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500

GTCCCACTTC TGGTTCCCCT ATTGGCTGGT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGGTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTGCAGCTC TAGGAGAAAG TTGTTGCTTC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAGTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGGTTA 1920 

GCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence

Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I 

MDSCLQHMRD LLYLLQELRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60

ADATVFTDGS SFLEQGERKA VSFPQPDLPD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120

WRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180

AEKGLQNVDF YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISRVPHPK 240 

LCTRKNCNPL TITVHDPNAA QWYYGMSWGL RLYIPGFDVG TMFTIQKKIL VSWSSPKPIG 300

PLTDLGDPIF QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS LMSILGGVHH LLNLTQPKLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVYVYS 480

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSAID ILHSQVESLA EVVLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVIKG 600

TVKKVRENLD RHQQERENNI PWYQSMFNWN PWLTTLITGL AGPLLILLLS LIFGPCILNS 660 
FLNFIKQRIA SVKLTYLKTQ YDTLVNN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGACC AGAAACTTCT 780 

CCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCCAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATGGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT CCCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860

TGGCTTTATC GGGTTATACG GATGAGAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTCCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580

GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640
AACTGGTTTA CATGGTGTCA TAATTGCAGG CACGGTGGAC ATGCTGGACA TATGCTTAGT 2700
TGGTTCAGGG ACCATGCAQA GTGCCCT6TG TCTGCATGCA CGTGTAAATG TATGCAGTTG 2760
GATACAACGG GGAATCTGGT ACCTGCAGAQ ACTGTCCAGC CATAAAATGT TACCACCTTA 2820
AGAGAACCCT TCAAGTGTOO AGCTTTCTAO TAGGTGTCCT TCATAGCTCA GAAACATACC 2880
TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCO1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51

I I I I I I

MSGTKPDILW APHHVDRFVV CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60

PYMKCVAWYL NYDPECLLAV GQANGRVVLT SLGQDHNSKF KDLIGKEFVP KHARQCNTLA 120

WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180

LGQNDACLSL CWLPRDQKLL LAGMHRNLAI FDLRNTSQKM FVNTKAVQGV TVDPYFHDRV 240

ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLLATLTRDS NIIRLYDMQH 300 

TPTPIGDETE PTIIERSVQP CDNYIASFAW HPTSQNRMIV VTPNRTMSDF TVFERISLAW 360

SPITSLMWAC GRHLYECTEE ENDNSLEKDI ATKMRLRALS RYGLDTEQVW RNHILAGNED 420

PQLKSLWYTL HFMKQYTEDM DQKSPGNKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480

SDIQNLNEER ILALQLCGWI KKGTDVDVGP FLNSLVQEGE WERAAAVALF NLDIRRAIQI 540
LNEGASSEKG RRSESQCGSN GFIGLYG 

SEQ ID NO:61 PDQ3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TTGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GAGTCCTGGC TTTGTAAAAT GACTTATAAA GGTCCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240 

AGTGCTAAAT CTTGTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480 

AAGTTOTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780

AATAATGTAA GCCTTAATAT TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840

CTTACTTGAA AACTTT

SEQ ID NO:62 PDG3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51

I I I I I I

HGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRNYSMXVMF TALQPQRQCS VCRQANEEYQ XLANSWRYSS 120 

AFCNKLFFSM VDYDEGTDVP QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300 
AATSKGCVGK RRIICLVGLG LWFFFSFLL SIFRSKYHGY PYSDLDFE 

SEQ ID NO:63 PDG6 DNA SEQUENCE

Nucleic Acid Accession #: AL080235

Coding sequence: 245-453 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GGTCGCCGCA CCGGCCGCCT CCGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60

GGGGCGCCCA CCGCGCTGCC AGCCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGGCTG 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420

GCGGCCGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCCC TGTGTCCGTC 480

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCCGGTGTG CTTCGTGCTG 540

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTGC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960

CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020
TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG6 Protein sequence
Protein Accession #: CAB45781

1 11 21 31 41 51

I I I I I I

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHFCCL DFSLEELQGE 60
PGWRLNRKPI ESTLVACFMT LVIVVWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK

SEQ ID NO:65 PDM1 DNA SEQUENCE

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGGCCGCGGC CCGGGTCCCT CGCAAAGCCG CTGCCATCCC GGAGGGCCCA GCCAGCGGGC 60 

TCCCGGAGGC TGGCCGGGCA GGCGTGGTGC GCGGTAGGAG CTGGGCGCGC ACGGCTACCG 120 

CGCGTGGAGG AGACACTGCC CTGCCGCGAT GGGGGCCCGG GGCGCTCCTT CACGCCGTAG 180

GCAAGCGGGG CGGCGGCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240 

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAGAATC TTTTAGCTGA 300 

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCGAA TGAATGGTGA 360

TAAATTCCGA AAATTTATAA AGGCACCACC TCGAAACTAT TCCATGATTG TTATGTTCAC 420 

TGCTCTTCAG CCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480 

ACTGGCGAAC TCCTGGCGCT ATTCATCTGC TTTTTGTAAC AAGCTCTTCT TCAGTATGGT 540 

GGACTATGAT GAGGGGACAG ACGTTTTTCA GCAGCTCAAC ATGAACTCTG CTCCTACATT 600 

CAYGCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660 

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG GATTGCTGAC AGAACGGATG TTCATATTCG 720 

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780 

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840 

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900 

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960 

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTGA ATGCCGCTAT 1020 

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTGGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260 

CAAGTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320 

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380 

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500
CAATAAATGA CAATGTAATT A

SEQ ID NO:66 PDM1 Protein sequence:
Protein Accession #: NP_006756

1 11 21 31 41 51

| | | | | |

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE






SEQ ID NO:67 PDM2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000947 

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | |

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120

AGGTTGGCAG GTGACCAGAG GAATGCTTCC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTCCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720

AAGGACATTG TGGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780

TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTGATGAAA GACTTCAGCC TCTGCTCAAT 840

CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 

TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTCCAT GCGTCAGTTA 960 

CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATGGAG GCCGAATGCA GTATGGCCTA 1020

TTTCTGAAGG GCATTGGTTT AACTTTGGAA CAGGCATTGC AGTTCTGGAA GCAAGAATTT 1080 

ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCGTCAC 1140 

AGCTTTGGAA AGGAAGGCAA GAGGACAGAC TATACACCTT TCAGTTGCCT GAAGATTATT 1200 

CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 

GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320

TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTGTC AAAAATACTT TGAGATGATA 1380 

CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 

CAACGTATTC TAAATGGTGG TAAAGACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 

CAACCCAAAC CAAGTGTCCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 

TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620

GTTTTATAAC CCTTTTTOCT CAATAGCCTG TTTCCTGTTT TTAAGATTTT GCCTTTGTTG 1680 

TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 

AGCCTTGACC TTCCCAGCTC AAGTGATCCT CCTACCTCAG CCTCCCAAGT AGTTAGGACA 1800 

CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGG 1860

GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920

AGCGTCCCAG AGTGCTGGGA TTACAOTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1960

TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040 

TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 

AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160 

TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220

AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 

SEQ ID NO:68 PDM2 Protein sequence:
Protein Accession #: NP_000938

1 11 21 31 41 51

| | | | | |

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60

VSYVKGTEQY QSKLESELRK LKFSYREKLE DEYEPRRRDH ISHFILRLAY CQSEELRRWF 120

IQQEMDLLRF RFSILPKDKI QDFLKDSQLQ FEAISDEEKT LREQEIVASS PSLSGLKLGF 180

ESIYKIPFAD ALDLFRGRKV YLEDGFAYVP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240

QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQIDLLS TKSFPPCMRQ LHKALRENHH 300

LRHGGRMQYG LFLKGIGLTL EQALQFWKQE FIKGKMDPDK FDKGYSYNIR HSFGKEGKRT 360

DYTPFSCLKI ILSNPPSQGD YHGCPFRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTHY 420

QVACQKYFEM IHNVDDCGFS LNHPNQFFCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480
KDASSALASL NSSLEMDMEG LEDYFSEDS

SEQ ID NO:69 PDM3 DNA SEQUENCE

Nucleic Acid Accession #: NM_024840

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

| | | | | |

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60 

GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCACATG GATGCAGCCT 120

GTGTGGGAAG GCCTCCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 

AGAGAAGCCC TATGAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240

TCCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300

CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 

TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 

TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATGT GGGAAAGGCT TCAGCCAGAA 480

GACATGTTTA ATATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540

GTGTGGAAAA TCCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600

AGAGAAACCC TATACATGCA GTGACTGTGO GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 

CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 

TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATG 780 

TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 

TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900

AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTGAG 960 

CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 

ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 

AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140

TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATGG 1200

GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 

GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATCAGGG 1320 

GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 

CTAGTGGTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440 

GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500

AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560 

AGGATGTGTA TTTTAGGACA ATATACCTTG AATCACTAGT TGATATGTCA ATGACTAATT 1620 

AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 

CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AGCTCTTGTG 1740 

TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800

AAAATGTATT TAATTTAATA ATGTAACACA ACAAOTTTGG ATGTGTTTAA CTTTATAAAT 1860 
AATCACCCCA QAGGAATGAA GTTCAAAACT TGTGAATAAC C 



SEQ ID NO:70 PDM3 Protein sequence:
Protein Accession #: NP_079116

1 11 21 31 41 51

| | | | | |

MDAACVGRPS PKGPGSLNTR ELIQERSPMN ALNVTKHSAG NHSSMHIRKL TQERSHIYAV 60
IVEKASFRRE ISLYISEFIL EKNPIYAMNV EKASSKRATS LFIDVLTLER NPMNAMNVGK 120
ASARRHV

SEQ ID NO:71 PDM8 DNA SEQUENCE

Nucleic Acid Accession #: NM_018455

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AATTTCGGCA CGGGGGGGAG GCACAGTGAG TCCACTGGGG CACGGCAGCG TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACTGGTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCTGATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGGGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATGAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660 

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA GTCAGCTTCA GAGAAACTGA 720 

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAGC CAAACCAGTA 780 

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840 

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900 

CCGACAAGAG GAGATCATTT TAGATATTAC CGAAATGAAG AAAGCTTGCA ATTAGTGAAC 960 
ATGAAAGGAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA 

SEQ ID NO:72 PDM8 Protein sequence:
Protein Accession #: NP_060925

1 11 21 31 41 51

| | | | | |

MDETVAEFIK RTILKIPMNE LTTILKAWDF LSENQLQTVN FRQRKESVVQ HLIHLCEEKR 60

ASISDAALLD IIYMQFHQHQ KVWDVFQMSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120

VSFRETEENA VWIRIAWGTQ YTKPNQYKPT YVVYYSQTPY AFTSSSMLRR NTPLLGQELE 180

ATGKIYLRQE EIILDITEMK KACN

SEQ ID NO:73 PDM9 DNA SEQUENCE

Nucleic Acid Accession #: NM_016192
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGACGAAG ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA

SEQ ID NO:74 PDM9 Protein sequence:

Protein Accession #: NP_057276

1 11 21 31 41 51

| | | | | |

1 MVLWESPRQC SSWTLCEGFC WLLLLPVMLL IVARPVKLAA FPTSLSDCQT PTGWNCSGYD
61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV PVCGSNGESY QNECYLRQAA
121 CKQQSEILVV SEGSCATDAG SGSGDGVHEG SGETSQKETS TCDICQFGAE CDEDAEDVWC
181 VCNIDCSQTN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQDN TTTTTKSEDG
241 HYARTDYAEN ANKLEESARE HHIPCPEHYN GFCMHGKCEH SINMQEPSCR CDAGYTGQHC
301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVVVLC ITRKCPRSNR IHRQKQNTGH
361 YSSDNTTRAS TRLI



SEQ ID NO:75 PDO1 DNA SEQUENCE

Nucleic Acid Accession #: NM_014324

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GGCGCCGGGA TTGGGAGGGC TTCTTGCAGG CTGCTGGGCT GGGGCTAAGG GCTGCTCAGT 60
TTCCTTCAGC GGGGCACTGG GAAGCGCCAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120
GTCCGGCCTG GCCCCGGGCC GTNTCTGTGC TATGGTCCTG GCTGACTTCG GGGCGCGTGT 180
GGTACGCGTG GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC GGGGCAAGCG 240
CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300
GGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAGCTGGGCC 360
CAGAGATTCT GCAGCGGGAA AATCCAAGGC TTATTTATGC CAGGCTGAGT GGATTTGGCC 420
AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480
TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCGCTGA ATCTCGTGGC 540
TGACTTTGCT GGTGGTGGCC TTATGTGTGC ACTGGGCATT ATAATGGCTC TTTTTGACCG 600
CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATATTT 660
AAGTTCTTTT CTGTGGAAAA CTCAGAAATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720
CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780
GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840
GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900
TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG ACGGCACAGA 960
TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAGGA 1020
ACGGGGCTCG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTG CACCTCTGCT 1080
GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATCCT TTCATAGGAG AACACACTGA 1140
GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCAGATAA 1200
AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260
AATTTGAATA CTGCATTTAC AGTGTAGAGT AACACATAAC ATTGTATGCA TGGAAACATG 1320
GAGGAACAGT ATTACAGTGT OCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380
CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440
TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG ATATATTTGT 1500
TGATATTAAG ATTCTTGACT TATATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560
TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620
AAATGCCACA AATTGTATGG TGATAAAAGT CACGTOAAAC AGAGTGATTG GTTGCATCCA 1680
GGCCTTTTGT CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACTT TAGCAACAGT 1740
TATCACACTT TGTAATTTGC AAAGAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800
CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860
GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTGTTT CCCCGTGGGT 1920
?????????? TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980
TTCTGGATCT TATACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040
AAAAAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:76 PDO1 Protein sequence:
Protein Accession #: NP_055139

1 11 21 31 41 51

| | | | | |

1 MALQGISVVE LSGLAPGRXC AMVLADFGAR VVRVDRPGSR YDVSRLGRGK RSLVLDLKQP
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG PRDSAAGKSK AYLCQAEWIW PVQESFCRLA
121 GHDINYLALS GVLSKIGRSG ENPYAPLNLV ADFAGGGLMC ALGIIMALFD RTRTDKGQVI
181 DANMVEGTAY LSSFLWKTQK SSLWEAPRGQ NMLDGGAPFY TTYRTADGEF MAVGAIEPQF
241 YELLIKGLGL KSDELPNQMS TDDWPEMKKK FADVFAKKTK AEWCQIFDGT DACVTPVLTF
301 EEVVHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEEFGFS
361 REEIYQLNSD KIIESNKVKA SL

SEQ ID NO:77 PDO3 DNA SEQUENCE

Nucleic Acid Accession #: AB028951

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60
CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120
AGAGTCAAAA TAGCTGACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180
GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240
GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300
TTGACTTCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360
CATCATGATC AACTGGATCG GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420
GAAGATATTA GAAAGATGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACO 480
TATGCCAACA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAG 600
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GACCCTTTGC CAACATTAGA TGTATTTGCC 660
GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840
ACCGCAGGTG GGGCTGGGGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900

GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960
AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 

CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260
TACTGAGCAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTG ACTTCTCTGA TAAAGCOTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 

TTTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500

AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AACCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 

GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860
ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCTCTAAAG AACCATTGGT 2100 

TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTGC ACTGAATGAT TTCTTTTGCC CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400 

AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 

TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGCCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGOTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGOC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000

TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060
CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAACCC 3240 

ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300

GTCAGTCTAC CTTAGAGAAA GCCAOTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCOC 3540 

TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600

CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 

CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900

CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTOP 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 

TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200

GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260

CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTT TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 

CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500

AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560

GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTO 4740 

AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800

AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860

TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TGTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040
CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100

ACGTGTTGTA CCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160
TATGTATTAT ATAAAAAAAA AAACCCTTAA TGCACTGTTA TCTCCTAAAT ATTTAGTAAA 5220
TTAATACTAT TTAATTTTTT TAAAGATTTG TCTGTGTAGA CACTAAAAGT ATTACACAAA 5280
ATCTGGACTG AAGGTGTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340
AGTATATCCT TTCTAAACTG CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400
ACCTGTTCTT GTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460
GAGATGACTG TAGCTTTTCG TGCTCCACTG CGAGGTTTGT GCTCAGAGCC GCTGCACCCC 5520
AGOGAGGCCT GCTCCATGGA GTGCAGGACG AGCTACTGCT TT00AGCGAG GGTTTCCTGC 5580
TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640
CATTTATTTT ATATTCTTGG TTGAAATAAA ATTTAATTGA CTTTG


SEQ ID NO:78 PDO3 Protein sequence:
Protein Accession #: BAA82980

1 11 21 31 41 51

| | | | | |

VKSLLYQILD GIHYLHANWV LHRDLKPANI LVMGEGPERG RVKIADMGFA RLFNSPLKPL
ADLDPVVVTF WYRAPELLLG ARHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDIKTSNPF
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLQKDFRRTT YANSSLIKYM EKHKVKPDSK
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYPKRE FLNEDDPEEK
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNG TAGGAGAGVG GTGAGLQHSQ
DSSLNQVPPN KKPRLGPSGA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS
QQSSQYHPSH QAHRY

SEQ ID NO:79 PDO5 DNA SEQUENCE

Nucleic Acid Accession #: XM_002922

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGAATCCTT ?????????? ?????????? GAAACTCTTT TTTCACCTGT CTCCATTGAA 60
GAGGTACCAC ?????????? ?????????? AAGAAGCCAT CTCCGACAAT CTGTCGCTCC 120
AACTATCCAC ?????????? ?????????? GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180
TATGGAATGA ?????????? ?????????? TTCCTGTATT TCCTGCACTG GAATGAAGAT 240
ACCTCCACAT ?????????? ?????????? AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300
GCAGCCATTG ?????????? ?????????? TTCAAGACAA TCATCTATCT CTCCTTGGTG 360
TATGTGCTTG ?????????? ?????????? GGTGCCTTAC CAATACTGGG AGGACAAGTG 420
GTACACACAG ?????????? ?????????? AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480
AAACCCTCTG ?????????? ?????????? CAGTTTGAAG AAAAACATGC AGAGGAACGG 540
ACTAGATACT ?????????? ?????????? ATCAATGCAG GGAGCTTGAT TTCTACATTT 600
ATCACACCCA ?????????? ?????????? TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660
TTTGGAGTTC ?????????? ?????????? GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720
ATATACAATA ?????????? ?????????? ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780
TTTGCTATTT ?????????? ?????????? TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840
CTAGACTGGG ?????????? ?????????? CAGCTCATTA TGGATGTAAA GGCACTGACC 900
AGGGTACTAT ?????????? ?????????? ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960
TCACGATGGA ?????????? ?????????? AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020
CCGGACCAGA ?????????? ?????????? CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080
TTTGTCATTT ?????????? ?????????? GGAATTAACT TCTCATCACT TAGGAAAATG 1140
GCTGTTGGTA ?????????? ?????????? TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200
ATAAATGAAA ?????????? ?????????? CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260
CTGGCAGATG ?????????? ?????????? GTGGGAAATG AAAACAATTC TCTGTTGATA 1320
GAGTCCATCA ?????????? ?????????? CACTATTCCA AACTGCACCT GAAAACAAAA 1380
AGCCAGGATT ?????????? ?????????? CACAATTTGT CTCTCTACAC TGAGCATTCT 1440
GTGCAGGAGA ?????????? ?????????? ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500
ATGATGGTAA ?????????? ?????????? ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560
AACACTTTGC ?????????? ?????????? CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620
GAAGACTATG ?????????? ?????????? GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680
TGTAGAACAG ?????????? ?????????? AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740
TATCTGTTTG ?????????? ?????????? CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800
ATTCCAGCCA ?????????? ?????????? CAGCTACCAC AATATGCCCT GGTTACAGCT 1860
GGGGAGGTCA ?????????? ?????????? GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920
ATGAAATCTG TGCTCCAGGC ?????????? TTGACAATTG CAGTTGGGAA TATCATCGTG 1980
CTTGTTGTGG CACAGTTCAG ?????????? CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040
CTCCTGCTGG TGATCTGCCT ?????????? ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100
ACAGAGGATA TGCGGGGTCC ?????????? CACATTCCTC ACATCCAGGG GAACATGATC 2160
AAACTAGAGA CCAAGAAGAC ??????????

SEQ ID NO:80 PDO5 Protein sequence:
Protein Accession #: XP_002922

1 11 21 31 41 51

| | | | | |

MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60
YGMKAVLILY FLYFLHWNED TSTSIYHAPS SLCYFTPILG AAIADSWLGK FKTIIYLSLV 120
YVI/3HVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180
TRYFSVFYLS INAGSLISTF ITPMLRGDVQ CFGEDCYALA FGVPGLLMVI ALVVFAMGSK 240
IYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300
KVLFLYIPLP MFWALLDQQG SRWTLQAIRM NRNLGFFVLQ PDQMQVLNPF LVLIFIPLFD 360
FVIYRLVSKC GINFSSLRKM AVGMILACLA FAVAAAVEIK IMEMAPAQSG PQEVFLQVLN 420
LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480
VQEKNWYSLV IREDGNSISS MMVKDTESKT TNGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540
EDYGVSAYRT VQRGEYPAVH CRTEDKNFSL NLGLLDFGAA YLFVITNNTO QGLQAWKIED 600
IPANKMSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGNIIV 660
LVVAQFSGLV QWAEFILFSC LLLVICLIFS IMGYYYVPVK TEDMRGPADK HIPHIQGNMI 720
KLETKKTKL

SEQ ID NO:81 PDO6 DNA SEQUENCE

Nucleic Acid Accession #: NM_020448

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCCCAC AAGTAGCTCC 60 

AGCGCCGTAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGGCGC CCTCTTGGCG 120 

ATCTTCGGGC ACCTCGTGGT CAGCATTGCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTG GCTGGGCCTG 240 

TTCCTGATGC TTCTCGGCGA GCTCGGTGTG TTCGCCTCCT ACGCCTTCGC GCCGCTGTCA 300 

CTCATCGTGC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTTTCCT TTTGTACATG 540 

CTGGTGGAGA TCATTCTGTT CTGCTTGCTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTGTCC ATTCAAGGGA ACCTGCAGCT TGACTACCCC 720

ATCTTCTACG TGATGTTCGT GTGCATGGTG GCAACCGCCG TCTATCAGGC TGCGTTTTTG 780

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTGCCA GTGTGGGCTA CATTCTGTCC 840 

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960 

ACGCGTAACA GGAAGAAGCG CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020

GGTATGCAGA ACATGCACGA TAAAGGGATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
GAGCACACCA AGAAGGAATG A

SEQ ID NO:82 PDO6 Protein sequence:
Protein Accession #: NP_065181

1 11 21 31 41 51

| | | | | |

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLVVSIA LNLQKYCHIR 60

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120

IKEKWKPKDF LRRYVLSFVG CGLAVVGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180

LVEIILFCLL LYFYKEKNAN NIVVILLLVA LLGSMTVVTV KAVAGMLVLS IQGNLQLDYP 240

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDFIGEDV 300

LHICMFALGC LIAFLGVFLI TRNRKKPIPF EPYISMDAMP GMQNMHDKGM TVQPELKASF 360
SYGALENNDN ISEIYAPATL PVMQEEHGSR SASGVPYRVL EHTKKE

SEQ ID NO:83 PDO8 DNA SEQUENCE

Nucleic Acid Accession #: NM_032712

Coding sequence: 555-908 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATCCC TCCAGACACT 60

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAG GAAAGCGCTG CCACCCACCC 120

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTTGTGCCTC AGCTCTCGAG AGCTGCTGCT GCCGGGTGAC CTGATCCAAC CTGATAAGGT 300 

GCCATCTTCA GCTACCACTO CAAGGCCCTG AGGGCAACAG CAGCACGGCA CTGCCCACCC 360

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTGG ATAAGTGGGC 720

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGGG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG CCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

OATCCCCCAG GCATCOTGTG CCATGTTGCA CTTCTGCCCA GGCAGCAGGG TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 
ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA

SEQ ID NO:84 PDO8 Protein sequence:
Protein Accession #: NP_116101

1 11 21 31 41 51

| | | | | |

MTVLEAVLEI QAITGSRLLS MVPGPARPPG SCWDPTQCTR TWLLSHTPRR RWISGLPRAS 60
CRLGEEPPPL PYCDQAYGEE LSIRHRETWA WLSRTDTAWP GAPGVKQARI LGELLLV

SEQ ID NO:85 PDT1 DNA SEQUENCE

Nucleic Acid Accession #: NM_000693

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120

CATCCGCAAC CTGGAGGTCA AGTTCACCAA GATATTTATC AACAATGAAT GGCACGAATC 180 

CAAGAGTGGG AAAAAGTTTG CTACATGTAA CCCTTCAACT CGGGAGCAAA TATGTGAAGT 240 

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

GAGGGGCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAAGCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT CCCGGAGCAA TCTGAAGCGG GTGACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGACGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620

CGGACGGCGG AATGTGGCAG ATGAAATGTG CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860 

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGAGTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTG CATACCGTGG AAGGGCGOCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220 

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCATATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE

Protein Accession #: NP_000684

1 11 21 31 41 51

| | | | | |

MATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIFINNEW HESKSGKKFA TCNPSTREQI 60

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDALSRGRLL HQLADLVERD RATLAALETM 120

DTGKPFLHAF FIDLEGCIRT LRYFAGWADK IQGKTIPTDD NVVCFTRHEP IGVCGAITPW 180

NFPLLMLVWK LAPALCCGNT MVLKPAEQTP LTALYLGSLI KEAGFPPGVV NIVPGFGPTV 240

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSNLKRVTLE LGGKNPCIVC ADADLDLAVE 300

CAHQGVFFNQ GQCCTAASRV FVEEQVYSEF VRRSVEYAKK RPVGDPFDVK TEQGPQIDQK 360

QFDKILELIE SGKKEGAKLE CGGSAMEDKG LFIKPTVFSE VTDNMRIAKE EIFGPVQPIL 420

KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGF 480
KMSGNGRELG EYALAEYTEV KTVTIKLGDK NP

SEQ ID NO:87 P0V3 DNA SEQUENCE

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GACCATTAGC AGGCACCCAG GCCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180

ACCATGCCCA GCCTGCTGCT GCTGTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240

CTGACAGACG CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTCCCG GGCTCTCCCC TGGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCCGGGCCT GCCGCGAGGG CGAGCTCTCC 600 

ACCTGCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960

AAGGGCCGGC TGGAGCTGGT CAACAGCCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAGTG 1440

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG GCTTTTTCTC TCCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980

CCCCAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040

GACGTTATAC TGTCTTCCCC CACCCCCGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGCGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGGTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 



SEQ ID NO:88 Protein sequence
Protein Accession #: NP_116031

1 11 21 31 41 51

| | | | | |

MPSLLLLFTA ALLSSWAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60

LCQLYQEHMA YIGEGAKTGI KECQHQFRQR RWNCSTADNA SVFGRVMQIG SRETAFTHAV 120

SAAGVVNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAKEFVDARE 180

REKNFAKGSE EQGRVLMNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240

VGDRLKEKYD SAAAMRVTRK GRLELVNSRF TQPTPEDLVY VDPSPDYCLR NESTGSLGTQ 300
GRLCNKTSEG MDGCELMCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EIVDQYICK-

SEQ ID NO:89 PDT9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_033280
Coding sequence: 58-636 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGACCA 660

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720

TGTATAAAAO GGAACAGTGT GGAGATGTTT TTGTCTTGTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAAA

SEQ ID NO:90 PDT9 Protein sequence
Protein Accession #: NP_150596

1 11 21 31 41 51

| | | | | |

MVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALMIWKG LIVLTGSESP 60
IVVVLSGSME PAFHRGDLLF LTNFREDPIR AGEIVVFKVE GRDIPIVHRV IKVHEKDNGD 120
IKFLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGMVTIIMND YPKFKYALLA 180
VMGAYVLLKR ES

SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016590

Coding sequence: 691-975 (underlined sequences correspond to start and stop codons)



GATTACTCAC 
CGTGTCAGAA 
TACGCCAACT 
CCAGAGCCAG 
AGGGCTGCAC 
CACTTTGCCT 
GGTGATCTGG 
CAGCTGTGCC 
CAAACGCCTG 
CAGGTTGTTA 
TGGAAAACAA 
AAAATTTCAT 
AGCCAACAAT 
TTTCCCATAG 
GTCACCAGCA 
TCTGTCAATA 
AAAAGTTTTG 
GAACACATTC 
ATAATAATCA 
ATCTTTAGTC 
GATGTCCCAT 
AGCCTGTGCA 
ATCTTCTTTT 
CTCAATTCAT 
CGATCAATTA 
AAGGGTCACC 
AAACAAGAAT 
TGAACCAAGC 
TCTCAAAATC 
ATTTACGTAA 
CTGTTAGTAT 
GATAAAGAGC 
GAACCAGTTT 
CTCATAGTGT 
CTCATCTTAA 
AAAAAAAA 



11 
I 

ACAGTCTTGA 
CTCAATTACG 
ATAGGACTCG 
CCAATCACTT 
TGGAACAACA 
CTAAAGGCCA 
GGAAAACGCA 
GGCGCTACGG 
AGTGCTGCTG 
GAATTGTTAC 
GCCCTTTATT 
TAAAATCCCC 
ACTCTAAACT 
GAAGACTTCA 
ACCATCCGCA 
TACAACTGAG 
TTTGACTCAA 
ATGATGAGAA 
TGATAACCTG 
CTTGGAGCTG 
TATTATCCAC 
AAACAAAGCA 
GCCTAAAATT 
CAGGACTTGT 
ATGTTTTCTG 
CAAATAGCTG 
TAAGATGATC 
ACTGTCAGCA 
TGGGCCAAGA 
GCCAAAGAAA 
GTCAAATCAA 
CCCATTATTT 
CCTGGTAGGG 
AAACTGAAAG 
CAAAGCAAAA 



21 

I 

AGATGCAATG 
ACTACATATG 
TGCTTCTCGT 
AGCTCCTCAT 
CAGATGAGAT 
GAGAAAAATC 
GCTACACCTG 
GACCCGAGCC 
CCTTCGGTGA 
CCCCTTTACT 
GAATTTTCAA 
TTGAACTCCC 
GAGGCCTGCA 
CCTCCTACAA 
GTCATTCAAG 
TTACAGACTG 
CTTCAAGCTG 
CTTTCTAAAA 
AAACATGTTA 
TCACATAGCA 
CCTGAGCCAC 
ATGGAAAAGG 
ACTAATGCAC 
ATTAGCAGGT 
GTGATCACAT 
AGTGCAGTCC 
CCAATAAAAG 
AATCTCAOGT 
ATGATTGCTA 
GTCACTCATG 
CTAAGACTGG 
TCACAGTGCC 
AACTGCTGAC 
AAAAATAGTT 
AAAAATGCTT 



Protein Accession #: NP_057674



11 



21 



31 
I 

TCAGCTATTT 
CATTAAGGCA 
ACGCTGGGCT 
AACAAGTCTA 
ATTCTACACA 
ACAGCTTCCT 
GAGCAAGGTC 
GTCCCAGAAA 
CTATATGAGA 
CAGAGATAAC 
CACAGACTCC 
ATGTTCAAAT 
AGTCATTTCA 
CTCCGAAGAA 
TGGAAGCTTT 
TCCCCTGGCT 
CTCATCTGTT 
GACCAGCACT 
CTGGGACTCG 
GGGGCAACCT 
CATAATATGC 
AAACTAAAAA 
CACGTCAGTC 
TCTGGCTAGA 
CAGGCCCTAT 
TTGCTCATAT 
AAAAATTGCT 
ATTAGAGCAA 
GGTCCATAAG 
AGTAAACTAT 
CAGGGTATTA 
AGCCTCTACC 
AGTTTCAATG 
GCTTTTTAAA 
TAATTCAAAT 



31 




CCAAAGGGCA 
ATGGAAACTT 
ATAGATTATC 
CTGCTTCTCA 
CTCCATTTGT 
TTTGTATTTT 
AACCCTTACT 
CACAGCTTTT 
CCCTGACCCT 
AGTAAGTGAT 
GCTCTTCCCC 
ACATTTTTCT 
CACACTGAAA 
TGTTTACATT 
ATATACATAC 
TGCTTCCTTC 
GAGACTATCT 
CTAAGAAGCT 
TTCCTTCATC 
CAGGAAACTG 
CTATGGTTGA 
CTAATTTGTC 
AGAAAACGTT 
ACTCCATTCC 
TAAGGAAACC 
CTGACAGTTG 
ATGTCAGCAA 
TAAAAATCAT 



51 

i 

CATOCAAGGC 
GGCCTCAGGG 
AAACTGAGCT 
GAAAGCTGAA 
TATCTGGAAT 
GAAAAGGACA 
TTGGCAATCT 
GGCACGGCAG 
CTAAGGAAGC 
CAGGCTGAGA 
TCTCCTTAAT 
TGACAGACAA 
TGTCCAGAAA 
GTCCAAGACC 
GTACATTCTC 
TACAAACACT 
GTTCACTCCA 
TCCTATAATC 
GGGGATTGAA 
CAAAGGAAGT 
TATTTTCTTC 
TAGTACCATT 
AGGCATCATT 
CCTGTCATCA 
CATGGTATAC 
TTAACCCCGC 
AACCTTTTTC 
TTOAAAAGTG 
TGGCCTTGCC 
CAGACCCATC 
AGGTGACATG 
CTAGACCTTG 
GAGCCAATGC 
GAAGGCCTGC 
GATACTAAAA 



51 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 



41 

I I I I I I 

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV IXVRWAIIYE TELQSQPIT 

SEQ ID NO:93 PEE6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTCGCCAG 360

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGCCCCAGGG CTGCTACCAG 480

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600

TTGGCTGTCC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660
TGCAAGAGTG ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720

TGCCCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780

CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGCCCTGCG GAAGCCGACC 840 

TTTGACGTCT GGCTTTGGGA GCCCAATGAG ATGCTGAGCT GCCTGGAGCA CATGTACCAC 900

GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960

TGTGTCCACG ACAACTACAG AAACAACCCC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020 

GCCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080 

GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCACGATC TGGACCATCC CGGCTACAAC 1140 

AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200

CTGGAGAACC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA GTGCAACATC 1260

TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATCCGAC AGGGAATGAT CACATTAATC 1320 

TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380 

AATTTTGACT ACAGCAACGA GGAGCACATG ACCCTGCTGA AGATGATTTT GATAAAATGC 1440 

TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500

TTAGAGGAAT ATTTTATGCA GAGCGACCGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560

TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGCCCAGA TTGGGTTCAT CAAGTTTGTC 1620 

CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680 

CAGCCACTTT GGGAATCCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740 

AAAGAGTTAC AGAAGAAGAC TGACAGCTTG ACGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800 

AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860

CTGCAGTTCT GGACGGGCTG GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920 

TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980 
AAAAAAAAAA A 

SEQ ID NO:94 PEE6 Protein sequence
Protein Accession #: NP_002597

1 11 21 31 41 51

| | | | | |

MGSGSSSYRP KAIYLDIDGR IQKVIFSKYC NSSDIMDLFC IATGLPRNTT ISLLTTDDAM 60

VSIDPTMPAN SERTPYKVRP VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR VVGLEQPRRE 120

GAFESGQVEP RPREPQGCYQ EGQRIPPERE ELIQSVLAQV AEQFSRAFKI NELKAEVANH 180

LAVLEKRVEL EGLKVVEIEK CKSDIKKMRE ELAARSSRTN CPCKYSFLDN HKKLTPRRDV 240

PTYPKYLLSP ETIEALRKPT FDVWLWEPNE MLSCLEHMYH DLGLVRDFSI NPVTLRRWLF 300

CVHDNYRNNP FHNFRHCFCV AQMMYSMVWL CSLQEKFSQT DILILMTAAI CHDLDHPGYN 360

NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGFK QIRQGMITLI 420

LATDMARHAE IMDSFKEKME NFDYSNEEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480

LEEYFMQSDR EKSEGLPVAP FMDRDKVTKA TAQIGFIKFV LIPMFETVTK LFPMVEEIML 540
QPLWESRDRY EELKRIDDAM KELQKKTDSL TSGATEKSRE RSRDVKNSEG DCA

SEQ ID NO:95 PEG4 DNA SEQUENCE

Nucleic Acid Accession #: none

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60 

TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120 

TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180

TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTCC TGGGACGTGA AACTGGGAGC 240

CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCCAGGC GAATACATCA CAAAAGTCTT 300

TGTCGCCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360 

CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420 

GCTGGTGGGC ATCTATGGCC AGTATCAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480

55 GAATTATCCA CTAGAGGAGC CGACCACTGA GCCACCAGTT AATCTCACAT ACTCAGCAAA 540 

CTCACCCGTG GGTCGCTAGQ GTGGGGTATG GGGCCATCCG AGCTGAGGCC ATCTGTGTGG 600 

TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660 
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A 

SEQ ID NO:96 PEG4 Protein sequence

Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 60

WDVKLGALGG NTQEVTLQPG EYITKVFVAF QAFLRGMVMY TSKDRYFYFG KLDGQISSAY 120 

PSQEGQVLVG IYGQYQLLGI KSIGFEWNYP LEEPTTEPPV NLTYSANSPV GR 

SEQ ID NO:97 PEL9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006953

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CCGTTCCGCG CTCTGGCGGC TCCTCCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGCCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300

TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360

GCAGTGACCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGG 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTOTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720 

TTGCCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCCCAGGCC CTGCAGCGGT 960 

GGTTGTCACA CCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 



SEQ ID NO:98 PEL9 Protein sequence
Protein Accession #: NP_008884

1 11 21 31 41 51

| | | | | |

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKEALTGTHE 60

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDLIPC SDLPSLDAIG 120

DVSKASQILN AYLVRVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180

LWSDPIRTNQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDMGSS 240
DGETTHDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD

SEQ ID NO:99 PEN1 DNA SEQUENCE

Nucleic Acid Accession #: NM_012391

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180 

TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGOCTCT 240 

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCGCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCG AGGTGGACTC 1140 

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGCCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GGACCCCATC TGAGTGCCTG GCCCAGGGCC 1440

TGAAACCCGC CCTCAGGGGC CTCTCTOCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTCCC 1680

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCOCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 

SEQ ID NO:100 PEN1 Protein sequence
Protein Accession #: NP_036523

1 11 21 31 41 51

| | | | | |

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAFYL 60

SYFDMLYPED SSWAAKAPGA SSREEPPEEP EQCPVIDSQA PAGSLDLVPG GLTLEEHSLE 120

QVQSMVVGEV LKDIETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP MGKAFQELAG 180

KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEESWTDSEV 240

DSSCSGQPIH LWQFLKELLL KPHSYGRFIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAM 300
NYDKLSRSIR QYYKKGIIRK PDISQRLVYQ FVHPI



SEQ ID NO:101 PEN3 DNA SEQUENCE

Nucleic Acid Accession #: NM_000742

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GAGAGAACAG CGTGAGCCTG TGTGCTTGTG TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60

GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120

CTGCATGAAG CCGTTCTGGC TGCCAGAGCT GGACAGCCOC AGGAAAACCC ACCTCTCTGC 180

AGAGCTTGCC CAGCTGTCCC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240

GGGGTGTCTC CTAAAOCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300

GCTCTATTCT GTACCTGCCA CTCTATTTCT GGGGTGACTT TTGTCAGCTG CCCAGAATCT 360

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TTTCTTCTGT AACCACAGGT 420

TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480

TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAAG CCCTGACCTG ACCTCCTGAT 540

GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAGCTCAGCC 600

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660

CTCCTGGAGA CCCACTCTCC TCTCCCAGTC CCACGGCATT GCCGCAGGGA GGCTCGCATA 720

CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCGC TGGGCGCGCC 780

CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840

TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTGGA 900

GCGACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960

CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020

CAGTGACCCA CATGACCAAG GCCGACCTCT TCTCCACGGG CACTGTGCAC TGGGTGCCCC 1080

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCACCTT CTTCCCCTTC GACCAGCAGA 1140

ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200

TGGAGCAGAC TGTGGACCTG AAGGACTACT GGGAGAGCGG CGAGTGGGCC ATCGTCAATG 1260

CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320

CCTACGCCTT CGTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440

AGATCACGCT GTGCATTTCG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACCGCT 1620

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCCC 1680

GGTGGCTTCT GATGAACCGG CCCCCACCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGG 1800

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920

AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040

TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100

CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220

ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTTT GGAGTCTGTC CGAGTTTGCA 2340

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520

CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640
TACGCGTGCA GCAGGCAAAC AAGA

SEQ ID NO:102 PEN3 Protein sequence

Protein Accession #: NP_000733

1 11 21 31 41 51

| | | | | |

MGPSCPVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP QGGSHTETED 60
RLFKHLFRGY NRWARPVPNT SDVVIVRFGL SIAQLIDVDE KNQMMTTNVW LKQEWSDYKL 120
RWNPADFGNI TSLRVPSEMI WIPDIVLYNN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180
SSCSIDVTFF PFDQQNCKMK FGSWTYDKAK IDLEQMEQTV DLKDYWESGE WAIVNATGTY 240
NSKKYDCCAE IYPDVTYAFV IRRLPLFYTI NLIIPCLLIS CLTVLVFYLP SDCGEKITLC 300
ISVLLSLTVF LLLITEIIPS TSLVIPLIGE YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360
TMPHWVRGAL LGCVPRWLLM NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREVVVEEE 420
DRWACAGHVA PSVGTLCSHG HLHSGASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480
RSEDADSSVK EDWKYVAMVI DRIFLWLFII VCFLGTIGLF LPPFLAGMI



SEQ ID NO:103 PEU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_018670

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GTGCCCGCCG CTCTCCGAGT 120

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480

ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660



TATCCGCCGT CCGCGCCGGG GCGTCCTGGG GATCCCCGCC TGCCTGCCCC GGAGCCCGAG 720 

CTGCACCCGA GCCGCGCGAC CCGCCTGCGC TGTTCGCCGA GGCGGCGTGC CCGGAAGGGC 780

AGGCGATGGA GCCAAGCCCA CCGTCCCCGC TCCTTCCGGG CGACGTGCTG GCTCTGTTGG 840 

AGACCTGGAT GCCCCTCTCG CCTCTGGAGT GGCTGCCTGA GGAGCCCAAG TGACAAGGGA 900 

CAACTGACGC CGTCTCTGTG AGCACCGAGG CTTTTTGGCC TCAGCACCTT CGAAGTGGTT 960 

CCTTGGCAGA CTGCCTTTCC TGGAAGAGGG CACGGGCGAT CCCGACGGGG GCATTCCTGC 1020 

GGGTGAGAGC CGTCCCCACC GCGGCGGCCC TTCTCAGCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGGA GGCTCTGCCT AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTGT CAAAAAAAAA AAAAAAAAAA A 

SEQ ID NO:104 PEU4 Protein sequence
Protein Accession #: NP_061140

1 11 21 31 41 51

| | | | | |

MAQPLCPPLS ESWMLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60

LRDPRAPSVG RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120

TKIETLRLAI RYIGHLSAVL GLSEESLQRR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180

EGQGQGRGLG LVSAVRAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPE GQAMEPSPPS 240
PLLPGDVLAL LETWMPLSPL EWLPEEPK



SEQ ID NO:105 PEU5 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CCACGGAGAA GCCCACCGAT GCCTACGGAG AGCTGGACTT CACGGGGGCC GGCCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGATCCAGC TGCAGTTTAT AGTCTGGTCA 120 

CACGCACATG GGGCTTCCGT GCCCCGAACC TGGTGGTGTC AGTGCTGGGG GGATCGGGGG 180 

GCCCCGTCCT CCAGACCTGG CTGCAGGACC TGCTGCGTCG TGGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTGT AGGGGACCAT CAGATGGCCA GCACTGGGGG CACCAAGGTG GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACCCT CATCAACCCC AAGGGCTCGT 420 

TCCCTGCGAG GTACCGGTGG CGCGGTGACC CGGAGGACGG GGTCCAGTTT CCCCTGGACT 480 

ACAACTACTC GGCCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGCGAGA 540 

ACCGCTTCCG CTTGCGCCTG GAGTCCTACA TCTCACAGCA GAAGACGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCCCTGTC CTGCTCCTCC TGATTGATGG TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTCAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720

CTGCGGACTG CCTGGCGGAG ACCCTGGAAG ACACTCTGGC CCCAGGGAGT GGGGGAGCCA 780

GGCAAGGCGA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900 

AGGATGGGTC TGAGGAATTC GAGACCATAG TTTTGAAGGC OCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGCCTAC CTGGATGAGC TGCGTTTGGC TGTGQCTTGG AACCGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATCCAATG GCGGTCCTTC CATCTCGAAG 1080

CTTCCCTCAT GGACGCCCTG CTGAATGACC GGCCTGAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGGCCAC TTCCTGACCC CGATGCGCCT GGCCGAACTC TACAGCGOGG 1200 

CGCCCTOCAA CTCGCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCAGGCACCA 1260 

AAGCCCCAGC CCTAAAAGGG GGAGCTGCGG AGCTCCGGCC CCCTGACGTG GGGCATGTOC 1320 

TGAGGATGCT GCTGGGGAAG ATGTGCGCGG CGAGGTACCC CTCCGGGGGC GCCTGGGACC 1380

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTCGGACAAG GCCACCTCGC 1440 

CGCTCTCGCT GGATGCTGGC CTCGGGCAGG CCCCCTGGAG CGACCTGCTT CTTTGGGCAC 1500 

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTCC AATGCAGTTT 1560 

CCTCAGCTCT TGGGGCCTGT TTGCTGCTCC GGGTGATGGC ACGCCTGGAG CCTGAOGCTG 1620

AGGAGGCAGC ACGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680

TTGGCGAGTG CTATCGCAGC AGTGAGGTGA GGGCTGCCCG CCTCCTCCTC CGTCGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATGCA AGCTGACGCC CGTGCCTTCT 1800 

TTGCCCAGGA TGGGGTACAG TCTCTGCTGA CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTG GTTCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920 

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980

ATAGTGTCAT TAATGGGGAA GGGCCTGTCG GGACGGCGGA CCCAGCCGAG AAGACGCCGC 2040 

TGGGGGTCCC GCGCCAGTCG GGCCGTCCGG GTTGCTGCGG GGGCCGCTGC GGGGGGCGCC 2100 

GGTGCCTACG CCGCTGGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA CCTGCTGTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTCCCTG GAGCTGCTGC TCTATTTCTG GGCTTTCACG CTGCTGTGCG 2280

AGGAACTGCG CCAGGGCCTG AGCGGAGGCG GGGGCAGCCT CGCCAGCGGG GGCCCCGGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT CGOCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCGTGGG CTGCCGGCTG ACCCCGGGTT 2460 

TGTACCACCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GGTTTTCACG GTGCGGCTGC 2520

TTCACATCTT CACGGTCAAC AAACAGCTCG GGCCCAAGAT CGTCATCGTG AGCAAGATGA 2580

TGAAGGACGT GTTCTTCTTC CTCTTCTTCC TCGGCGTGTG GCTGGTAGCC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTG CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATG GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTCGGAGC CCGGCTTCTG GGCACACCCT CCTGGGGCCC 2820

AGGCGGGCAC CTGCGTCTOC CAGTATGCCA ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880

TCCTGCTCGT GGCCAACATC CTGCTGGTCA ACTTGCTCAT TGCCATGTTC AGTTACACAT 2940 

TCGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAATTCCA CTCTCGGCCC GCGCTGGCCC CGCCCTTTAT CGTCATCTCC CACTTGCGCC 3060 

TCCTGCTCAG GCAATTGTGC AGGCGACCCC GGAGCOCCCA GCCGTCCTCC CCGGCCCTCG 3120

AGCATTTCCG GGTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180

TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGOAGAGC GACTCCGAGC 3240

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA GTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGOGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT AGCAGCTCTO CCATGTTGCC CTCAGGTGGG CCGCCACCCC 3420 

TTGACCTGCA TGGGTCCAAA GAGTGAGCCA TGCTGGCGGA TTTTAAGGAG AAGCCCCCAC 3480

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTCG GCCCCCGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCOCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCCGGGCC GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106

1 11 21 31 41 51 

I I I I I I 

MASTGGTKVV AMGVAPWGVV RNRDTLINPK GSFPARYRWR GDPEDGVQFP LDYNYSAFFL 60

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLIDGDEKM LTRIENATQA 120

QLPCLLVAGS GGAADCLAET LEDTLAPGSG GARQGEARDR IRRFFPKGDL EVLQAQVERI 180

MTRKELLTVY SSEDGSEEFE TIVLKALVKA CGSSEASAYL DELRLAVAWN RVDIAQSELF 240

RGDIQWRSFH LEASLMDALL NDRPEFVRLL ISHGLSLGHF LTPMRLAQLY SAAPSNSLIR 300 

NLLDQASHSA GTKAPALKGG AAELRPPDVG HVLRMLLGKM CAPRYPSGGA WDPHPGQGFG 360 

ESMYLLSDKA TSPLSLDAGL GQAPWSDLLL WALLLNRAQM AMYFWEMGSN AVSSALGACL 420

LLRVMARLEP DAEEAARRKD LAFKFEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDM ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVINGEG PVGTADPAEK TPLGVPRQSG RPGCCGGRCG GRRCLRRWFH 600 

FWGAPVTIFM GNVVSYLLFL LLFSRVLLVD FQPAPPGSLE LLLYFWAFTL LCEELRQGLS 660

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC FLLGVGCRLT PGLYHLGRTV 720 

LCIDFMVFTV RLLHIFTVNK QLGPKIVIVS KMKKDVFFFL FFLGVWLVAY GVATEGLLRP 780

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VALMEHSNCS SEPGFWAHPP GAQAGTCVSQ 840 

YANWLVVLLL VIFLLVANIL LVNLLIAMFS YTFGKVQGNS DLYWKAQRYR LIREFHSRPA 900

LAPPFIVISH LRLLLRQLCR RPRSPQPSSP ALEHFRVYLS KEAERKLLTW ESVHKENFLL 960
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 

SEQ ID NO:107 PEW3 DNA SEQUENCE

Nucleic Acid Accession #: NM_005982

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

CGCCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGQC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGCCGCGCCC CCCTCCCTGC GGCCGCCGCC CCCTGCCTCT CGGCTCTGCT CCCTGCCGCG 240 

TGCGCCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG GTCGCCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCOGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GAGCTGGCCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACCGGGCCGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCACC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTCCT CGGCCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 
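
The "Coding sequence" coordinates given with each DNA entry can be applied to the printed listing programmatically. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive of the stop codon (as the entries in this listing suggest); the helper names `clean_listing` and `extract_cds` and the toy sequence are hypothetical and are not part of the listing.

```python
import re

# Standard stop codons on the DNA (coding-strand) level.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_listing(text: str) -> str:
    """Remove position counters, spaces and line breaks; keep letters only."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def extract_cds(listing: str, start: int, end: int) -> str:
    """Return the coding sequence for 1-based, inclusive coordinates.

    Raises ValueError when the slice does not look like an open reading
    frame, which may indicate a transcription error in the listing.
    """
    seq = clean_listing(listing)
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    cds = seq[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    if not cds.startswith("ATG"):
        raise ValueError("no ATG at the stated start position")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("no stop codon at the stated end position")
    return cds


# Toy listing (hypothetical data) laid out like the entries above:
# blocks of ten bases followed by a running position counter.
toy_listing = """
GGCCATGAAA TTTTAAGGCC 20
"""
toy_cds = extract_cds(toy_listing, 5, 16)
print(toy_cds)
```

Because the check rejects any slice that lacks a start or stop codon, it can also serve as a quick consistency test on a transcribed entry before further analysis.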

SEQ ID NO:108 PEW3 Protein sequence
Protein Accession #: NP_005973

1 11 21 31 41 51 

I I I I I I 

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNESVL KAKAVVAFHR 60

GNFRELYKIL ESHQFSPHNH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120

IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180

AAEAKERENT ENNNSSSNKQ NQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS 
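
The length of each protein entry can be cross-checked against the coding coordinates of the corresponding DNA entry. The sketch below assumes that the stated coordinates include the stop codon; the function names are hypothetical. For coding coordinates 276-1130 it yields (1130 - 276 + 1) / 3 - 1 = 284 residues.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residue count implied by 1-based, inclusive coding coordinates.

    Assumes the coordinates include the terminal stop codon, which does
    not contribute a residue.
    """
    n_bases = cds_end - cds_start + 1
    if n_bases < 6 or n_bases % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return n_bases // 3 - 1


def listed_protein_length(listing: str) -> int:
    """Count residue letters in a printed entry, ignoring counters and spaces."""
    return sum(ch.isalpha() for ch in listing)


print(expected_protein_length(276, 1130))
```

A difference between the two counts for a given entry would suggest dropped or merged characters in the transcribed protein sequence.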

SEQ ID NO:109 PFJ8 DNA SEQUENCE

Nucleic Acid Accession #: NM_005069

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51
I I I I I I

GGGGCTCCGC GGGCCTGGAG CACGGCCGGG TCTAATATGC CCGGAGCCGA GGCGCGATGA 60
AGGAOAAGTC CAAGAATGCO GCCAAGACCA GGAGGGAGAA GGAAAATGGC GAGTTTTACG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240 
ACGCGTGGGG ACAGCCGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300
CGCACTTGCT GCAGACTTTG GATGGATTTG TTTTTGTGGT AGCATCTGAT GGCAAAATCA 360
TGTATATATC CGAGACCGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420
ACAGTATTTA TGAATACATC CATCCTTCTG ACCAOGATGA GATGACCGCT GTOCTCACGG 480
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TCGTTCTTTC 540
TTCGAATGAA ATGTGTCTTG GCGAAAAGGA ACGCGGGCCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTG CAGTGGCTAC TTGAAGATCA GGCAGTATAT GCTGGACATG TCCCTGTACG 660
ACTCCTGCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720
TCACCGAGAT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780
TGATATTCCT GGATTCCAGG GTGACCGAGG TGACGGGTTA CGAGCCGCAG GACCTGATCG 840
AGAAGACCCT ATACCATCAC GTGCACGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900
ACCTCCTGTr GGTGAAGGGC CAGGTCACCA CCAAGTACTA CCGGCTGCTG TCCAAGCGGG 960
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGGC 1020 
CCCACTGCAT CGTGAGTGTC AATTATGTAC TCACGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CCCAGGACTC CTGGAGGACC GCCTTGTCTA 1140
CCTCACAAGA AACTAGGAAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200
GAACAAACCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260
GCCAGCTCGG AAACTGGAGA GCCAGTCCCC CTGCAAGCGC TGCTGCTCCT CCAGAACTGC 1320
AGCCCCACTC AGAAAGCAGT GACCTTCTGT ACACGCCATC CTACAGCCTG CCCTTCTCCT 1380 
ACCATTACGG ACACTTOCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
CGGCCAAGTT CGGGCAGCCC CAAGG ATCCC CTTGTGAGGT GGCACGCTTT TTCCTG AGCA 1500 
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGCCCGCC GCCGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680 
GCTTCCCG AG CTGCGGCCAC TACCGCGAGO AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTGCG 1800 
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCXCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCCGGCGGC CCCGAGGCGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 
GCCCCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCAA CGGGAGGTGA CCCGCTGGCC GCCCGCGCCA GGAGCCTGGA CCCGGCCTCC 2100
CGGGGCTGCG GCGCCACCGA GCCCGGCAAA TCCGCACGAC CTACATTAAT TTATGCAGAG 2160
ACAGCTGTTT GAATTGGACC CCGCCGCCGA CTTGCGGATT TCCACCGCGG AGGCCCCGCG 2220
CGCCGGTGCC GAGGGCCGAG GAGCGCCCGG GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTTGGTTT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC C ACTTTCAGG AGGGAAAAOC CACCCTACCA CAGTCCGCTC 2460 
TTCCAAOTGG ACGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTG AAGGC AGAAGTGATG ATTGT AAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT ATTTTTGTTT TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTOCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCCCTATGA ACTCTrGATA 2760 
ACACCAAGAG TAGCACCTTC AG AATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAGACA GTTTATGAGA ATGACCCTGT CAAGCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGCCCACA CAG ATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA CGG AACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGCGAG 3060 
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACCGCCGT 3240
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300
ATTTTAGGAA ACTTTTTCCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTGACTTTT 3420
TTTTTTTTTT TTTTGCCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGG A CCGTGGGTCA 3540 
TGCAGCG AAG GGGCTGGATG GTAGGAAGGG ATGTGCCCGC CTCTCCACGC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTO CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA ATTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTACCTTGTT 3780
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG



SEQ ID NO:110 PFJ8 Protein sequence:
Protein Accession #: NP_005060.1

1 11 21 31 41 51 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660
VIITNGR

SEQ ID NO:111 PFJ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_006549
Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I
ATGAACGGAC GCTGCATCTG CCCGTCCCTG CCCTACTCAC CCGTCAGCTC CCCGCAGTCC 60
TCGCCTCGGC TGCCCCGGCG GCCGACAGTO GAGTCTCACC ACGTCTCCAT CACGGGTATG 120
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTOCTAT 180
GGTGTCGTCA AGTTGGCCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TCCAAAAAGA AGCTGATCCG GCAGGCCGGC TTTCCACGTC GCOCTCCACC CCGAGGCACC 300 
CGGCCAGCTC CTGGAGGCTG CATCCAGCCC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTGOCATCC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420
GACCCCAATG AGGACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GCCCGTGATG 480 
GAAGTGCCCA CCCTCAAACC ACTCTCTGAA GACCAGGCCC GTTTCTACTT CCAGGATCTG 540 
ATCAAAGGCA TOGAGTACTT ACACTACCAG AAGATCATCC ACCGTGACAT CAAACCTTCC 600 
AACCTCCTGG TCGGAGAAGA TGGGCACATC AAGATCGCTG ACTTTGGTGT GAGCAATGAA 660 
TTCAAGGGCA GTGACGCGCT CCTCTCCAAC ACCGTGGGCA CGCCOGCCTT CATGGCACCC 720
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG CCTTGGATGT TTGGGCCATG 780
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GCGGATCATG 840
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900
GAGGACTTGA AGGACCTGAT CACCCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960
GTGCCGGAAA TCAAGCTGCA CCCCTGGGTC ACGAGGCATG GGGCGGAGCC GTTGCCGTOG 1020
GAGGATGAGA ACTGCACGCT GGTCGAAGTG ACTGAAGAGG AGGTCGAGAA CTCAGTCAAA 1080
CACATTCCCA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATACGTAA ACGCTCCTTT 1140 
GGGAACCCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCACCAAAA AACCAACCAG GGAATGTGAG TCCCTGTCTG AGCTCAAGAC QIAfiAAAATA 1260 
AGTCCCCTTC CTGCCTGTTG CAAAGTAACG TAAGAGTTCC CTCACCCGAG TGGATGCAGA 1320
CGTTCTTGCT GTCAGCCACC TTCCTTCATA CACATAGCCA GCCCAGGGTG ACCAGAACGT 1380 
CCCAGGACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCTGGTG GGCACCCCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTGACTTGG TGGGAGTTCC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560
TACAATTCAC ATACCATGTA ATTCACCCAC GGGAAGTGTA TQATTCAGTG GTTTCTAATA 1620
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACGACAT TTTCATCAGC CCAAGAAGAC 1680 
AOCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740 
GTATGGATTT GCCTATTCTG GACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800
AAAA

SEQ ID NO:112 PFJ7 Protein sequence:
Protein Accession #: NP_006540.1



1 11 21 31 41 51 
I I I I I I 

MNGRCICPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60 
GVVKLAYNEN DNTYYAMKVL SKKKLIRQAG FPRRPPPRGT RPAPGGCIQP RGPIEQVYQE 120
IAILKKLDHP NVVKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180
IKGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADFGVSNE FKGSDALLSN TVGTPAFMAP 240
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPFMDERIM CLHSKIKSQA LEFPDQPDIA 300
EDLKDLITRM LDKNPESRIV VPEIKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360
HIPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECE SLSELKT 

SEQ ID NO:113 PFJ6 DNA SEQUENCE

Nucleic Acid Accession #: NM_021810
Coding sequence: 1-429 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

ATGAAACCTC TGATATGGAC ATGGTCAGAT GTTGAAGGCC AGAGGCCGGC TCTGCTCATC 60
TGCACAGCTG CAGCAGGACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGCCA 120
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTACCCAG ATGCCACAAT GCACAGACAA 180 
CTCCTGGCTC CGGTGGAAGG AAGGATGGCA GAGACATTGA ATCAGAAACT CCATGTTGCC 240 
AATGTGCTGG AAGATGACCC CGGCTACCTA CCTCACGTCT ACAGCGAGGA AGGGGAGTGT 300
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360
CTGGACTCTT TGGGTTCAAA AGCOACTCCG TTTGAGGAAA TATATTCAGA GTCAGGTGTT 420
CCTTCCTAA 

SEQ ID NO:114 PFJ6 Protein sequence: 
Protein Accession #: NP_068582.1

1 11 21 31 41 51 
I I I I I I 

MKPLIWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RSVKNIHSTP AYPDATMHRQ 60
LLAPVEGRMA ETLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120
LDSLGSKATP FEEIYSESGV PS 



SEQ ID NO:115 PFJ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006361

Coding sequence: 131-995 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51
I I I I I I

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGGAGA GCGAGCTGGG TGCCCCCTAG ATTCCCCGCC CCCGCACCTC ATGAGCCGAC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTGGAT GGAGCCAAGG ATATCGAAGG 180
CTTGCTGGGA GCGGGAGGGG GGCGGAATCT GGTCGCCCAC TCCCCTCTGA CCAGCCACCC 240 
AGCGGCGCCT ACGCTGATGC CTGCTGTCAA CTATGCCCCC TTGGATCTGC CAGGCTCGGC 300 
GGAGCCGCCA AAGCAATGCC ACCXATGCCC TGGGGTGCCC CAGGGGACGT COCCAGCTCC 360 
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420 
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480
GGAAGAGTAC CCCAGTCGOC CCACTGAGTT TGCCTTCTAT CCGGG ATATC CGGGAAOCTA 540 
CCACGCTATG GCCAGTTACC TGGACGTGTC TOTGGTGCAG ACTCTGGGTG CTCCTGGAGA 600 
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660 
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTGACGCC TGCGCCTTTC GTCGCGGCCG 780 
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGGAG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGCCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGCCAA 960 
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020 
GGGGTGTCCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080 
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140
CGGCCTGGGT ACCCAGTATG TGCAGGGAGA CGGAACCCCA TGTGACAGGC CCACTCCACC 1200
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 



SEQ ID NO:116 PFJ5 Protein sequence:
Protein Accession #: NP_006352.1

1 11 21 31 41 51 

MEPGNYATLD GAKDIEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KQCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120
PSRPTEFAFY PGYPGTYHAM ASYLDVSVVQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP 



SEQ ID NO:117 PFJ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_005626

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTCCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGGAGG TCCGCTTCCC GGCCCCCAGA 120 
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGGA 180 
TCCAGGCGTC CGGGATCTGC GCCACCAGAA CCTAGCCTCC TGCAGACCTC CGCCATCTGG 240
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCAGAGAAAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCCGGGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360 
CACGGGGCAG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420 
AAGAGCCAAG GAACTTCAGT GCTGTGAACT CACAACTCTA AGGAGCCCTC CAAAGTTCCA 480 
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGGAACGTC GGGTCCTGGG AAGGAGCCCA 540 
AGCGCTCCCA GCCAGCTTCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600
ATCCTCCTCG AGACTCCAAG GGGCTCGCAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660
CGCTGGCCTC CATCGAGGAC CAAGGCGCGG CAGCAGGCGG CTACTGCGGT TCCCGGGACC 720
AGGTGCGCCG CTGCCTTCGA GCCAACCTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780 
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840
AGCGCTTOAG CGCCTTCGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTO CGGATGATCA 900
TCTTGCCGCT GGTGGTGTGC AGCTTGATCG GCGGCGCCGC CAGCCTGGAC CCCGGCGCGC 960
TCGGCCGTCT GGGCGCCTGG GCGCTGCTCT TTTTCCTGGT CACCACGCTG CTGGCGTCGG 1020 
CGCTCGGAGT GGGCTTGGCG CTGGCTCTGC AGCCGGGCGC CGCCTCCGCC GCCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGCCQAAA ATGCCCCCAG CAAGGAGGTG CTCGATTCGT 1140 
TCCTGGATCT TGCOAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACXAC CTATGAAGAG AGGAATATCA CCGGAACCAG GGTGAAGGTG CCCGTGGGGC 1260
AGGAGGTGGA GGGGATGAAC ATCCTGGGCT TGGTAGTGTT TGCCATOGTC TTTGGTGTGG 1320 
CGCTGCGGAA GCTGGGGCCT GAAGGGGAGC TGCTTATCCG CTTCTTCAAC TCCTTCAATG 1380 
AGGCCACCAT GGTTCTGGTC TCCTGGATCA TGTGGTACGC CCCTGTGGGC ATCATGTTCC 1440 
TGOTGGCTGG CAAGATCGTG GAOATGGAGG ATGTGGGTTT ACTCTTTGCC CGCCTTGOCA 1500 
AGTACATTCT GTGCTGCCTG CTGGGTCACX3 CCATCCATGG GCTCCTGGTA CTGCCCCTCA 1560 
TCTACTTCCT CTTCACCCGC AAAAACCCCT ACCGCTTCCT GTGGGGCATC GTGACGCCGC 1620
TGGCCACTGC CTTTGGGACC TCTTCCAGTT CCGCCACGCT GCCGCTGATG ATGAAGTGCG 1680
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740
CCGTCAACAT GGACGGTGCC GCGCTCTTCC AGTGCGTGGC CGCAGTOTTC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAAAGA TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA CCTCCCGGTC GACCATATCT CCTTGATCCT GGCTGTGGAC TGGCTAGTCG 1980 
ACCGGTCCTG TACCGTCCTC AATGTAGAAG GTGACGCTCT GGGGGCAGGA CTCCTOCAAA 2040
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCCCCT GGATCCGCTG CCAGTCCCCA CTGAGGAAGG AAACCCCCTC CTCAAACACT 2160 
ATCGGGGGCC CGCAGGGGAT GCCACGGTCG CCTCTGAGAA GGAATCAGTC ATGTAAACCC 2220
CGGGAGGGAC CTTCCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGGAATG 2280
GATAAATOGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340
CCAGCACCCT CCAGGACAGO AGATCTGOOA TGCCTGGCTG CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACCC CCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTGCTGCGT CCCCACCGTG ACCTGCCTGG CCTCCCCTGT 2520 
CTCAGGGAGC AGGTCACAGG TCACCATGGG GAATTCTAGC CCCCACTGGG GGGATGTTAC 2580 
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTGTGTG TGCACGTGTG 2640 
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTGACC TCCTGTCCCC ATGGTACGTC 2700 
CCACCCTGTC CCCAGATCCC CTATTCCCTC CACAATAACA GAAACACTCC CAGGGACTCT 2760 
GGGGAGAGGC TGAGGACAAA TACCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence:
Protein Accession #: NP_005619.1 

1 11 21 31 41 51 
I I I I I I 

MVADPPRDSK GLAAAEPTAN GGLALASIED QGAAAGGYCG SRDQVRRCLR ANLLVLLTVV 60
AVVAGVALGL GVSGAGGALA LGPERLSAFV FPGELLLRLL RMIILPLVVC SLIGGAASLD 120
PGALGRLGAW ALLFFLVTTL LASALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180
LDSFLDLARN IFPSNLVSAA FRSYSTTYEE RNITGTRVKV PVGQEVEGMN ILGLVVFATV 240
FGVALRKLGP EGELLIRFFN SFNEATMVLV SWIMWYAPVG IMFLVAGKIV EMEDVGLLFA 300
RLGKYILCCL LGHAIHGLLV LPLIYFLFTR KNPYRFLWGI VTPLATAFGT SSSSATLPLM 360
MKCVEENNGV AKHISRFILP IGATVNMDGA ALFQCVAAVF IAQLSQQSLD FVKIITILVT 420
ATASSVGAAG IPAGGVLTLA IILEAVNLPV DHISLILAVD WLVDRSCTVL NVEGDALGAG 480
LLQNYVDRTE SRSTEPELIQ VKSELPLDPL PVPTEEGNPL LKHYRGPAGD ATVASEKESV 540
M 



SEQ ID NO:119 PFJ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_006708

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

CTAGTTAAGG CGGCACAGGG CCGAGGCGTA GTGTGGGTGA CTCCTCCGTT CCTTGGGTCC 60 
CGTCGTCTGT GATACTGCAG TTCAGCCATG GCAGAACCGC AGCCCCCOTC CGGCGGCCTC 120
ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTGCAGCAGA CCATGCTACG AGTGAAGGAT CCTAAGAAGT CACTGGATTT TTATACTAGA 240 
GTTCTTGGAA TGACGCTAAT CCAAAAATGT GATTTTCCCA TTATGAAGTT TTCACTCTAC 300
TTCTTGGCTT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360
GCGCTCTCCA GAAAAGCTAC ACTTGAGCTG ACACACAATT GGGGCACTGA AGATGATGCG 420
ACCCAGAGTT ACCACAATGG CAATTCAGAC CCTCGAGGAT TCGGTCATAT TGGAATTGCT 480
GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTGAAGAAC TGGGAGTCAA ATTTGTGAAG 540
AAACCTGATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAGATCCTGA TGGCTACTGG 600
ATTGAAATTT TGAATCCTAA CAAAATGGCA ACCTTAATGT AGTGCTGTGA GAATTCTCCT 660
TTGAGATTTC AGAAGAAAGG AAACAATGTG ATTCAAGATA TTTACATACC AGAAGCATCT 720
AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTCC CCTTCCTATT 780
TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTTATCT 840
CATGTCCTTG AATATAGTTG TGTAACTTTA TTTTTTAGGT AATAATTAGA ACAGTTCCCT 900 
TCAGAGGCTG CATTTGCCTT CTTCTGCCAC CTAAATATTA CTTCCCTTCA AATCTGCCTT 960 
TOAATCATCA TTTTTAAAAA AAAATTAACA TGTTTTTGTT GTAGTTATCT TCTGGGGTTT 1020
CAATTCCTCA GAAACAACTT TTTTCACAAC GGAAAGGAAA GAACACTAGT GTTCTTTCAG 1080
TAAAGTACAA AGTGTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140
GCTGACAAOG ATACTGATAG AAAAAGTGAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200
CAAGGACTAA CCTTATTTAT TTGGGAAAGG GGAGGAGGAA GGAAATGATA TGGTACCCAG 1260
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAGA 1320
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTGCC TCAAAGTTGT 1440 
AGTACCTCAC AATACCTCCA CTGGTTTCCT GTTOTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
TGTGCTCTTG GCTCTTGGGC TGGAGTACCG TGGTGAGGGA GTAAACACTA GAAGTCTTTA 1560
GTACAAAACT GCTCTAGGGA CACCTGGTGA TTCCTACACA AGTOATGTTT ATATTTCTCA 1620
TAAAGAGTCT TCCCTATCCC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680
GTTCAGTGAT AACTTAGTTA TCAOAAATCA GCTCAGTGGT CTTCCCCGCC ATOATTCACA 1740
TTTGATGAGT TTTTAAAAAT CAAAGTGATT TTGAAAATCT CTAATGGCTC AGAAAATAAA 1800
AACATCCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAO ACTCTAGTGG AAGACCTTTG 1860
GAAAGGCCAT GCCAACCGTG CTTGTACTGC TAGAAGCACT TTATGTTTCC TTTTTGGGTG 1920
AAATGGATTT ATGTOAGTGC TTTAAACAAA TAGCAATACT TATAGACTGA AATAAAATGA 1980
AACTTCAAAT AAG 



SEQ ID NO:120 PFJ3 Protein sequence:
Protein Accession #: NP_006699.1

1 11 21 31 41 51 
I I I I I I 

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTLIQK 60
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQSYHNGNS 120
DPRGFGHIGI AVPDVYSACK RFEELGVKFV KKPDDGKMKG LAFIQDPDGY WIEILNPNKM 180
ATLM 



SEQ ID NO:121 PFJ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_002867

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I

CCGACGCCAG GTCCTGCCGT CCCGCCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60 
GA GTCC CKX3 AJQSG CTTC A GT GACAG ATGGT AAACATGG AG TCAAAGATGC CTCTGACCAG 120 
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTCC 180 
TTCCTCTTGC GCTATGCTOA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240 
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGCGGG TGAAACTGCA GATCTGGGAC 300
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360
TTCATTCTGA TGTATGACAT CACCAATGAA GAGTCCTTCA ATGCTGTCCA AGACTGGGCT 420
ACTCAGATCA AGACCTACTC CTGGGACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480
GACATGGAGG AAGAGAGGGT TGTTCCCACT GAGAAGGGCC AGCTCCTTGC AGAGCAGCTT 540
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600
CGCCTGOTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660 
CTGGGCTCCT CGAAGAACAC GCGTCTCTCG GACACOCCAC CGCTGCTGCA GCAGAACTGC 720 
TCATGCIA2C AAGGCCCACC TTCCTG ACCT CCCCTCATTG TGGCCCCACA CCCAAGTCTG 780 
CTTCTCCCTG TTACACACTG TCCGCTCT 



SEQ ID NO:122 PFJ2 Protein sequence:
Protein Accession #: NP_002858.1

1 11 21 31 41 51
I I I I I I

MASVTDGKHG VKDASDQNFD YMFKLLIIGN SSVGKTSFLL RYADDTFTPA FVSTVGIDFK 60
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRGAMGFIL MYDITNEESF NAVQDWATQI 120
KTYSWDNAQV ILVGNKCDME EERVVPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC 



SEQ ID NO:123 PFJ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTCC 60
TGCATGAGGG CGCGGTAGAG ACCCGGACCC GCGCCGTGCT CCTGCCGTTT CGCTGCGCTC 120
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCCATG ATTCGCCTOG GGGCTCCCCA 180
GTCGCTGGTG CTGCTGACGC TGCTCGTCGC CGCTGTCCTT CGGTGTCAGG GCCAGGATGT 240 
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATGATAAGG ATGTGTGGAA 300
GCCGGAGCCC TGCCGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360
CTGTGAAGAC GTGAAAGACT GCCTCAGCCC TGAGATCCCC TTCGGAGAGT GCTGCCCCAT 420
CTGOCCAACT GACCTCGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480 
ACCTGGAGAC ATCAAGGATA TTGTAGGACC CAAAGGACCT CCTGGGCCTC AGGGACCTGC 540
AGGGGAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGGACC 600
TCGTGGCAGA GATGGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660
CCCCCCTGGT CCCCCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720 
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780 
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840
TGAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCCCC 900 
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAACCTGGA AAAGCTGGTG AAAGGGGTCC 960 
GCCTGGTCCT CAGGGTGCTC GTGGTTTCCC AGGAACCCCA GGCCTTCCTG GTGTCAAAGG 1020 

TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080
GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGCCT 1140
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTGAAGCCGG CCCCACTGGT GCCCOTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGCCTGCTG GTGCCTCCGG 1380 

TAACCCTGGA ACAGATGGAA TTCCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
CCCCAAGGGA GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTGAAGA 1620 
AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680
AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT GGTGAAGATG GTCGTCCTGG 1920 
AOCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980 

AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT GAACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGCGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGCCTCG TGGGTCCCAG GGGTGAACGA GGTTTCCCAG GTGAACGTGG 2280 

CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCCCCCTGGC GCACAGGGCC CTCCAGGTCT 2400
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580 

TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700
GGGTGAGCAA GGAGAGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGGACCCCC 2880
AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT GACGGTCCCT CTGGTGCCGA 3060
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120
ACGTGGTGAG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTGAGCCCG GCAAGCAGGG 3180

TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360 
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420 
AGGAGAAGCT GGTGCACAAG GCCCCATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGGAAT 3480 

CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600
TGGAGACCAA GGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660 
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGGA CGATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780 

TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840
GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900
GAGACAGCAT GACGCCGAGG TGGATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080 

60 GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140 
AGCAAACGTT CCC AAG AAG A ACTGGTGGAG CAGCAAGAGC AAGGAGAAG A AAC ACATCTG 4200 
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
- _ CATCACCTAC CACTGCAAG A ACAGCATTGC CTATCTGG AC GAAGCAGCTG GCAACCTCAA 4380 

65 GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440 
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGA AACAT ACCGGTAAGT GGGGCAAG AC 4500 
TGTTATCGAG TACCGGTCAC AG AAG ACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560 
GGAC ATAGG A GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGIA 4620 
AAAACCTGAA CCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 

70 AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTG ACC TGACCTGATG TCCATTC ATC 4740 
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CTCTCTTTCT AAGAGACCTG AACTGGGCAG 4800 
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860 
CA AGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCCA GAAG ACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 

75 ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATI 11 IT AAA ACATCAATTG ATATTAAAAA 5040 
TGAAAAGATT ATTGGAAAGT 



SEQ ID NO:124 PFJ1 Protein sequence: 
Protein Accession #: NP_001835.2 

1 11 21 31 41 51 
| | | | | | 

D MIRLOAPQSL VLLTtXVAAV LRCQGQDVQE AGSCVQDGQR YNDKDVWKPB PCRICVCDTG 60 
TVLCDDUCE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120 
PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG RDGEPOTPGN PGPPGPPGPP GPPGLGGNFA 180 
AQMAGGFDEK AGG AQLGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 240 
1A PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDOAKG 300 
10 EAGAPG VKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKG 420 
SAGAPGIAGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480 
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DG1AGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQOARGQPG 600 
15 VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 780 
KDGGRGLTGP IGPPGPAG AN GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG 840 
ADGQPG AKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKG ARGAQ GPPGATGFPG 900 
20 AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAG VKGDRGE TGAVGAPGAP GPPGSPGPAG 1080 
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140 
_ _ LPGPPOPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI POPIGPPGPR GRSGETGPAG 1200 
25 PPGNPGPPGP PGPPGPGIDM SAFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PUDIAPMDl GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005084 
Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GCTGGTCGG A GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60 
40 GCGTTGGTGC GCGGTGGAAC GCGCCCAGGG ACCCCAGTTC CCGCGAGCAG CTCCGCGCCG 120 
CGCCTGAGAG ACTAAGCTGA AACTGCTGCT CAGCTCCCAA GATQGTGCCA CCCAAATTGC 180 
ATGTGCTTTT CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTG AC TGGCAATACA 240 
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGGAAA TGGGCCTTAT TCCGTTGGTT 360 
45 GTACAGACTT AATGTTTGAT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATCCAT 420 
CCCAAGATAA TGATCGCCTT G ACACCCTTT GGATCCCAAA TAAAG AATAT TTTTGGGGTC 480 
TTAGCAAATT TCTTGGAACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540 
CAATGACAAC TCCTGCAAAC TGGAATTCCC CTCTGAGGCC TGGTGAAAAA TATCCACTTG 600 
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GG ACACTTTA TTCTGCTATT GGCATTG ACC 660 
50 TGGCATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720 
CTTACTATTT CAAGG ACCAA TCTGCTGCAG AAATAGGGGA CAAGTdTGG CTCTACCTTA 780 
GAACCCTGAA ACAAGAGGAG GAGACACATA TACGAAATGA GCAGGTACGG CAAAGAGCAA 840 
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
55 TAGCAGTAAT TGGACATTCT TTTGGTGGAG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020 
AG AGATTCAG ATGTGGTATT GCCCTGGATG CATGG ATGTT TCCACTGGGT G ATGAAGTAT 1080 
ATTCCAGAAT TCCTCAGCCC CTCTTTTTTA TCAACTCTGA ATATTTCCAA TATCCTGCTA 1140 
ATATCATAAA AATGAAAAAA TGCTACTCAC CTGATAAAGA AAGAAAGATG ATTACAATCA 1200 
- A GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260 
60 ACATGCTCAA ATTAAAGGGA GACATAGATT CAAATGTAGC TATTGATCTT AGCAACAAAG 1320 
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGGAGATGAT GAGAATCTTA TTCCAGGQAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGG AA TAGAGAAATA CAATIAfiGAT TAAAATAGGT 1500 




SEQ ID NO:126 PFH9 Protein sequence: 
Protein Accession #: NP_005075.1 



1 11 21 31 41 51 

MVFPKLH VLF CLCGCLAV VY PFDWQYINPV AHMKSSAWVN KIQVLMAAAS FGQTKIPRGN 60 
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120 
«_ LRLLFGSMTT PANWNSPLRP GEKYPLWFS HGLGAFRTLY SAIGIDLASH GFTVAAVEHR 180 
75 DRSASATYYF KDQSAAEIGD KSWLYLRTIJC QEEETHIRNE QVRQRAKECS QALSULDID 240 
HGKPVKNALD LKFDMEQLKD SIDREKIAVI GHSFGGATV1 QTLSEDQRFR CGIALDAWMF 300 
PUGDEVYSR1 PQPLFFINSE YFQYPAN11K MKKCYSPDKE RKMmRGSV HQNFADPTFA 360 
TGKIIGHMLK UCGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCUE GDDENUPGT 420 
NINTTNQH1M LQNSSGIEKY N 
SEQ ID NO:127 PFH8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_015900 

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CACG AGCGGC ACGAGGATTT CCAGCTCAGC GAIfiCCCCCA GGTCCCTGGG AOAGCTGCTT 60 
CTGGGTGGGG GGCCTCATTT TOTGGCTCAG CGTTGGAAGT TCAGGGGATG CACXTCCTAC 120 
CCCACAGCCA AAGTGCGCTG ACTTCCAGAG CGCCAACCTT TTTGAAGGCA CCGATCTCAA 180 
AGTCCAGTTT CTCCTCTTTG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAGAAGGAAG 240 
CAGTGACCTC CAAAACTCTO GGTTCAATGC CACTCTGGGA ACCAAACTAA TTATCCATGG 300 
ATTCAGGGTT TTAGGA ACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGCG 360 
TGCAACGAAT GCTAATGTG A TTGCCGTGGA CTGGATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTGA TTAAGTTG AG CCTCGAGATC TCCCTTTTCC TCAATAAACT 480 
CCTGGTGCTG GGTGTGTCGG AATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCCCA 540 
CGTTGGGGGC ATGGTGGGAC AGCTCTTCGG AGGCCAGCTG GGACAGATCA CAGGCCTGGA 600 
CCCCGCTGGA CCTGAGTACA CCAGGGCCAG TGTGGAAGAG CGCTTGGATG CTGGAGATGC 660 
CCTCTTCGTG GAAGCCATCC ACACAGACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGGAC TACTTCGTCA ACGGAGGCCA AGACCAACCT GGCTGCOCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTGATCA CATGAGGGCT GTGCACCTCT ACATCAGCGC 840 
CCTGOAGAAT TCCTGTCCAC TGATGGCCTT TCCCTGTGCC AGCTACAAGG CCTTCCTTGC 900 
TGGACGCTGT CTGG ATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960 
GGAACAAGOT GGTGTCAAGA TAGAGCCGCT CCCCAAGGAA GTGAAAGTCT ACCTCCTGAC 1020 
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TGAAGGAACT 1080 
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCTCTTCATC 1140 
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGGA ATCATAGCCC ATGOCACCCC 1200 
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGACCGGACT ACCATTATTG GGAAGTTCTG CACTGCCCTT TTGCCTGTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
CCTGAAGATA GCCTGTGTGJLAfiTTTAACCT GGGCAGGACA CATCTCCCTG CA IMI 1111 1440 
il l 1 1 1 1 - 1 11 GAGAGAGAGG TGTGATGAGG GATGTGTGTG TGCAGCTTAT TGTAGACCAT 1500 
TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560 
GGGAGGGAGA ACTCATTTTA CAGAACTTGG TTTCCTTTGC CGATCTTATG TACATACCCA 1620 
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTCCTTGGGC ATTCGTACTT 1680 
AGGATTCAAT AGAAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 



SEQ ID NO:128 PFH8 Protein sequence: 
Protein Accession #: NP_056984.1 

1 11 21 31 41 51 
I I I I I I 

MPPGPWESCF WVGGULWLS VGSSGDAPPT PQPKCADPQS ANLFEGTDLK VQFLLFVPSN 60 
PSCGQLVEGS SDLQNSGFNA TLGTKLQHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120 
WIYGSTGVYF S AVKNVIKLS LEISLFLNKL LVLGVSESSI HHGVSLGAH VGGMVGQLFG 180 
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGIRIPVG HVDYFVNGGQ 240 
DQPGCPTFFY AGYSYUCDH MRAVHLYISA LENSCPLMAF PCAS YKAFLA GRCLDCFNPF 300 
LLSCPRIGLV EQGGVKIEPL PKEVKVYLLT TSSAPYCMHH SLVEFHLKEL RNKDTNTEVT 360 
FLSSNITSSS KITIPKQQRY GKGHAHATP QCQINQVKFK FQSSNRVWKK DRTTIIGKFC 420 
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV 



SEQ ID NO:129 PFH7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014384 

Coding sequence: 89-1336 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCG AAGG CGTTCAGACT 60 
CTTAGCTGAA CGCGGAGCTG CGGCGGC TAT G CTGTGG AGC GGCTGCCGGC GTTTCGGGGC 120 
GOGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GACCTCCTGC ATCGACCCTT CCATGGGACT TAATG AAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC G AGAGATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300 
GCTGTTCCCA OTGOATGTG A TGCGGAAGGC AGCCCAGCTA GGCTTCGQA G GGGT CTACAT 360 
ACAAACAGAT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATCCA CAACA TGTGTGCCTG 480 
GATG ATTGAT AGCTTCGGAA ATG AGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATGGAGAAG TTTGCTTCCT ACTGCCTCAC TGAACCAGGA AGTGGGAGTG ATGCTGCCTC 600 
TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGGTGAGT CAGACATCTA TGTGGTCATG TGCCGAACAG GAGGACCAGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTG A G AAGGGGACC CCTGGCCTCA GCTTTGGCAA 780 
GAAGGAGAAA AAGGTGGGGT GGAACTCCCA GCCAACACGA GCTGTGATCT TCGAAGACTG 840 
TGCTOTCCCT GTGGCCAACA G AATTGGG AG CG AGGGGCAG GGCTTCCTCA TTGCCGTG AG 900 
AGCACTGAAC GGAGGQAGOA TCAATATTGC TTCCTGCTCC CTGGGGGCTG CCCACGCCTC 960 
TGTCATCCTC ACGCGAGACC ACCTCAATGT CCGGAAGCAG TTTGGAGAGC CTCTGGCCAG 1020 
TAACCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGOGGCT 1080 
GATGGTCCGC AATGCAGCAG TGGCTCTGCA GGAGGAGAGG AAGGATGCAG TGGCCTTGTG 1140 
CTCCATGGCC AAGCTCTTTG CTACAGATG A ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200 
GATGCACGGG GGCTACGGCT ACCTGAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAGGGTCCAC CAGATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTQA TCTCTAGAAG 1320 
CCTGCTTCAG GAGTACjAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTG AGCTCC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT CGGGTCTTGG 1500 
ACTGGGGCAG AATCCCCAGT GGAACCGGAA GAGCTGOACT GATGAGAAAC ATCAGAAGAA 1560 
CACATACTAC CTTGTTTTCC TAATGCCAGA AGGGTGACCA GTGAAGATTC ACCGTCAAAC 1620 
CATGAAAGTC CTTTCTTGGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1680 
GGATCCCTCC TCTAGGGGCC TGGGGACTTT CACTGATGCT CTTCCTGATT CTAGAGCAAA 1740 
GGTGTGGGAA GGGGAAATGG AGGAATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800 
TACAGATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT G ATAAAATGG 1860 
ATATTTGGAA ACTTACTCCT AAGCTGTGAT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAG ACT TTTGAATGTT GAATATTCGT TGGGTTTCAT GTTAAGACGC 1980 
CTGTGGTCC A GG AGTGCTAT TCAGTGTTTC TGTTCCTG AT AAACACTTTG AATATTTTTT 2040 
TGTGTTTTTG TTTC CIU ' I C TGAAGCTGTT CCTC C1 1 1 1 A AATA 1 1 1 1 1 A ATCACATTGA 2100 
TAAAATCTAT CCTTCATCCA (XTCTGGTTC TACTATAGTT G AM 111 ATT TTAAATGTTT 2160 
AA TTGT ATTT GATTAAACAC TTAACTGGAT TTTGG AATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTTAAAA AAAAA AAA 



SEQ ID NO:130 PFH7 Protein sequence: 
Protein Accession #: NP_055199.1 



1 11 21 31 41 51 
I I I I I I 

MLWSGCRRPG ARLGCLPGGL RVLVQTGHRS LTSODPSMG LNEEQKEFQK VAFDFAAREM 60 
APNMAEWDQK ELFPVDVMRK AAQLGFGGVY IQTDVGGSGL SRLDTSVIFE ALATGCTSTT 120 
AYISIHNMCA WMIDSFGNEE QRHKFCPPLC TMEKFASYCL TEPGSGSDAA SLLTSAKKQG 180 
DHYILNGSKA FISGAGESDI YVVMCRTGGP GPKGISCIW EKGTPGLSFG KKEKKVGWNS 240 
QPTRAVIFED CAVPVANRIG SEGQGFUAV RGLNGGRIN1 ASCSLGAAHA S VILTRDHLN 300 
VRKQFGEPLA SNQYLQFTLA DMATRLVAAR LMVRNAAVAL QEERKDAVAL CSMAKLFATD 360 
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SR VHQILEGS NEVMREUSR SLLQE 



SEQ ID NO:131 PFH8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013989 

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAGAGAGT 60 
G AG AAAAAAG AGG AGTCAGT CGCTCCTGGG GAAGGG AGAG AGTG AGACTG GG AG AAAG AG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCTTAAA 180 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAG AAAGA AACAGGCTAC GTTTAAAGAG 240 
CATAGAGACA ATGAAAGGCT AAAGAAAATT TTAAAATCTC TGOCACAGTC TCATAGGTGC 300 
TTGGAAATG A AAGTA GAACT GCCTGTCTTT AACGGACTCT GACAGAGGTA ACTGGATTAG 360 
GGACGAGTAC GCC AGCTTTT 1 11 l lTl l lT TTTTTTTTTT TTTAACATCT TAAATCCTG A 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TGAATGAATT GATGGGCACA 480 
CrCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCTT CTTTG AAAGA 540 
GGAGACAACT TGGGCTTCCT TTTAATTTAG TTTTTTTTCC CCTTCTCCCC CAACCCCCAA 600 
CCTTCCCCCT TACCTCCCCC ACCCCCTTTA TCACCACCCC CCTTTTAAAT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGATGG GCATCCTCAG 720 
CGTAGACTTG CTGATCACAC TGCAAATTCT GGCAGTTTTT TTCTCCAACT GCCTCTTCCT 780 
GGCTCTCTAT GACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTGA GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAGAG GGACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTCGATGCCT ACAAACAGGT GAAATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAGAAGGAGG TGACAACAGT GGCAATGGTA CCCAGGAGAA 1020 
GATAGCTGAG GG AGCCACAT GCCACCTTCT TGACTTTGCC AGCCCTG AGC GCCCACTAGT 1080 
GGTCAACTTT GGCTCAGCCA CTTGACCTCC TTTCACGAGC CAGCTGCCAG CCTTCCGCAA 1140 
ACTGGTGGAA GAGTTCTCCT CAOTGGCTGA CTTCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCGA TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAGAAGCA 1260 
CCAGAACCAG GAAGATCGAT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320 
GCCCCAGTGC CGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380 
AGCCTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCTTCTCC TACAACCTTC AAGAAGTCCG GCATTGGCTG GAGAAGAATT TCAGCAAGAG 1500 
ATGAAAGAAA ACTAG ATTAG CTGGTTAAAG GTATGATTAT AAG AGAGCTT ATTGTTTTAA 1560 
AAAGTTATAT AAAGGCAAGG AAATTAAGAA CTGAATCCAT ATTTCAACAG AGCCCTATTG 1620 
GCTTACTGAA AG ACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680 
TTCTTTCACT ACTCAAATGG CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAGA TGGAGAGGAA OAAACGCTAA TTCAGCATGT 1800 
GTTCATTCTG CATTGAGAAG GAACTGATAC ATCTGATGCA TGCTTTGAOA CCAGAAGAAA 1860 
AO ACTTACCT GAATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920 
TTGCCTTGGC TCTATTTGGC ATGGATGGAG OCCAGTTGGA AAATTCCCAA ATATTACAAC 1980 
AAGTCCTTGA ACCCAGGCCA TGTGGTTAGA CGTTGGTGTT AAGGTTAGAC CTTATGTTAG 2040 
AGTCATTTCT GATOTTCCAG CTTCTAGCCA TGTAGTGCTC TCAGTCTTCA TACCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TACCGAGAAT GATCCCTCAG TCTGAGAGGT TAGAATGATC 2160 
5 ATCTGTAATC TGAGGGTTAA TTTCTAGGCA GGTGGAGAGA GTGGTAAAAA AGAAATG AAA 2220 
TTGACAAGCT AGG AAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAA AGAAG AAGGAGCTCA ACTAAAAGTG GCATAG AGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGOAGAAAG GGGTGATTGA AAGAAAAAAA AATACTTAAA 2460 

10 TATTTGTAAT TGTG AGGGGT TTCTTTTGGA AATAATTACT TTTG AACCAT GTATGTGGTA 2520 
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580 
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCCC AAGTCATTCC 2640 
TAATCTTGGT CTCTATAGCA GTGTTCTCTC TGAATGCTGA GCTGAAGAAA TTATACGTAC 2700 
c ATACACACAT ACATACATAC ATACA AATAT ATGTATATAT ATTCTCAGCT GCTGCGGGAG 2760 

1 D GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT G AGCTATAGT 2820 
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAGAAG AA GAGGAAGTTA 2880 
GAG ATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAACCCCC GGTATATCAT 2940 
GG AATTTCCA TTG ACATTTG AATTTGG ACT TOO ATCTTCC CTTGGTCCCA TTAGCTG AGG 3000 
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 

ZO TAAAATATTT J 1 1 1CI 11 IT AAAATAG AC A CTATAGTTTT ACCCATA AGT AATATTTAAA 3120 
GATTATAGCT CCCAAAAGAA TGGACCAACC ACTTTCGTAT CATAATTTCT TTTTGGTAAA 3180 
TATGAG ACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240 
TTTGTCCATT CCATTAATAA CTGAATAAA A AATAAATAAA ATGGATAG AA AAAAACTAAA 3300 
_ GTTGAAAATA CATTCTTAAA CTAGTTGTCT GAAATGAG AA AAG AGTGAQA ACTAGGTGTG 3360 

25 CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420 
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGGAC TGATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CC ACCCACAA AG AAACAAAG CAAATTTC AT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCAGTTGTT 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660 

30 AGATTTACTG TGGAACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720 
TGTTGTCCTT AAAATTCCCC TTTTTTCTCT ATGTACGATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840 
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAGAAGG A GGTCGTTTAT GTGTGCAGAC 3900 
AATTCTCCCT GAGGTTAGCC CAATGGAGAA ATGAAGCAGA GGAAGGAAAC ATAGAAAGAC 3960 

3 5 ATGGGCTATC AGGG AGG AAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020 
GTGGAAGGGC CAATGGAGAA AATGAATGGA CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
ATGTTCTTGG TGTTATCAGG GTTAAGCCCT GTA ATTATGT AACCTATTTA TOGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200 
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAG AA ATTATTAG AT TGCCAATACT 4260 

40 CATGTGCGTT TCATGTGTTT TATA AGGTTT GTTCCTTTG A AG AATTGTAG TTCTTAGTCC 4320 
CAC AGGG AAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATGATAT ATTTATATCA 4380 
TATATAAAAG TCTO AGTTCT CITTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440 . 
TTACATGG AG CTATCGGTTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500 
TTTCCAAGAA ATTTTAG ATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560 

45 AAAGACTTAT GTCTTGGACC TATCAAAAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620 
AGTGGGATCA ACAATG ATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680 
AG AAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAGTAACT 4740 
GTTCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800 
TTGGGTCTCT GGTCCTGTGT CTTCACCTCA TTTATAGCAC GTCTCCTTGA TTTTTGGTAG 4860 

50 TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920 
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980 
CTATTTGGGC TCTG AAATAA AAATTATGAA ATATGGTGAG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 

55 AGCAAGAAGA ATTGACTGAT TTACAGGACT TCTCTTTATG TCAATCTTAA GAGGATGGAT 5220 
GAATCTGGAC ATTTGTTCCA CCCGACCTCT GACTGATGGT TTGG AAAATA ACTTTAATTA 5280 
GGATCATATG ACCATTG AAA AAGGAAAAAT GTAGACTCTG ACTTCCGTCC CACTGAAGGA 5340 
TTAATG AAAA CCTTTACTAG CATTTAGAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG Gil 111 HIT TTTTTTTTTT 5460 

60 TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCGAACTT 5520 
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAG AG AGGATCTAGG ATGGGAGAGC TAGAAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTG AAAAC AAACTTCTCG CAACTGAAGG 5760 

65 AAGGCTGAAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820 
TTCCTAATGC CCTATGTGTA TAGTACCAGA AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AAC AGCAGAG CTCCAGGG AG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060 

70 TGGCTCAAAC OCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGAAGGO GATGGGAAAT AAAGAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTTTTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240 
GTATCCAGTA CTTTATAACC AAAGCAATTA AATCATATTG GGGTAGGGAA TGTTGGCCAG 6300 

_ _ TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360 

75 CGCCCCGAAG AGGGAGACAG AGATGTGCCA GAGTTGACCC AGTGTGCGOA TGATAACTAC 6420 
TGACG AAAGA GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTO TCTCCCTGGC AAGGAG AATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGGAGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660 
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAG A ATAAAAAAAA 6720 
AAAAAAAAAA AAAAA 



SEQ ID NO:132 PFH8 Protein sequence: 
Protein Accession #: NP_054644.1 

1 11 21 31 41 51 
| | | | | | 

MGILSVDLLI TLQILPVFFS NCLFLALYDS VILLKHWLL LSRSKSTRGE WRRMLTSEGL 60 
RCVWKSFLLD AYKQVKLGED APNSSWHVS STEGGDNSGN GTQEKIAEOA TCHLLDFASP 120 
ERPLWNFGS ATXPPFTSQL PAFRKLVEEF SS VADFLLVY IDEAHPSDGW AIPGDSSLSF 180 
EVKKHQNQED RCAAAQQLLE RFSLPPQCRV VADRMDNNAN IAYGVAFERV CIVQRQKIAY 240 
LGGKGPESYN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001141 

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CAGGCGTGTC CCAGGGGGAG CCCCGCTCTG CAGCCCTGTG CGCCGTAGAG AGCTGGACTT 60 
AGGCTGGCAG CAJjQGCCG AG TTCAGGGTCA GGGTGTCCAC CGGAG AAGCC TTCGGGGCTG 120 
GCACATGGGA CAAAGTGTCT GTCAGCATCG TGGGGACCCG GGOAGAGAGC CCCCCACTGC 180 
CCCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GGAGGACTTC CAGGTGACGC 240 
TCCCGGAGGA CGTAGGCCGA GTGCTGCTGC TGCGCGTGCA CAAGGCGCCC OCAGTGCTGC 300 
CCCTGCTGGG GCCCCTGGCC CCGGATGCCT GGTTCTGCCG CTGGTTCCAG CTGACACCGC 360 
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGGA GGAGCTTCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGGAT GAAAAGACAG TGGAAGACTT GG AGCTCAAT ATCAAATACT 600 
CCACAGCCAA GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660 
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GG AGGAGTCT GAATGAGATG AAAAGGATCT 720 
TCAACTTCCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTCGCCTC CCAGTTCCTG AATGGTCTCA ACCCTGTCCT GATCCGCCGC TGTCACTACC 840 
TCCCAAAGAA CTTCCCCGTC ACTGATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900 
GCTTGCAGGC TGAGCTAGAG AAGGGCTGCC TGTTCTTGGT GGATCACGGC ATCXTCTCTG 960 
GCATCCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGACCCTGC 1020 
TATACCAGAG CCCAGGCTGC GGGCCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TTGCTGGCCA 1140 
AGACCTGGGT GCGCAATGCC GAGTTCTCCT TCCATGAGGC CCTCACGCAC CTGCTGCACT 1200 
CACATCTGCT GCCTG AGGTC TTCACCCTGG CTACCCTGCO TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTC AA GCTGCTGATC COGCACACCC G ATAC ACCCT GCACATCAAC ACACTOGCCC 1320 
GGGAGCTGCT TATCGTGCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCTICTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTCTGC 1440 
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATO 1500 
GG ATGCAGAT TTGGGGTGCA GTGG AACGCT TTGTCTCTG A AATCATCGGT ATCTACTACC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAACCAG GAGAGCTCAG GTATCCCTTC CTCACTGGAG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTG ATAT TCACCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTCC TGTGCTTGGA TGCCCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTG 1860 
TCAATGCCAC ATGTGATGTC ATCCTTGCTC TCTGGTTGCTGAGCAAGGAG OCTGGAGACC 1920 
AAAGGCCCCT GGGCACCTAT CCGGATGAGC ACTTCACAGA GGAGGCCCCT CGGCGGAGCA 1980 
TCGCCACCTT CCAGAGCCGC CTGGCCCAGA TCTCGAGGGG CATCCAGGAG CGGAACCGGG 2040 
GCCTGGTGCT GCCCTACACC TACCTAGACC CTCCCCTCAT CGAGAACAGC GTCTCCATCI 2100 
MATCCCAGG GG AACACAGG CCCAGATGAC ATCCCTTTGA CCACATCGCT CTAGGATAAC 2160 
TGGCACCCAG AGAAAAGGAC TCCTCAGAAA AAACAGGCCC CCATGTGCCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGCCCACC TTGAGGGTTT TGCTAGTTGG 2400 
TTTTGTTTTG CGTTTACAGC CGTGGGGGGA AGCACATAAT CCCGCCCCAG GGCCC ACTAG 2460 
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGG AC AGCCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGCCC ATAATCCCAG CACTTTGGGA GATGGAGGCG 2580 
GGAAAATCAT TTGAGGTCAG AAGTTCAAGG CCAGCCTGGA CG ACATAGCG AGACTCCACC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence: 

Protein Accession #: NP_001132.1 
1 11 21 31 41 51 

MAEFRVRVST GEAFGAGTWD KVSVSIVGTR GE5PPLPLON LGKEFTAGAB EDFQVTLPED 60 
VGRVLLLRVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HLLFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNIKYSTAK 180 
NANFYLQAGS AFAEMKIKGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240 
QFLNGLNPVL IRRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGELSGIQT 300 
NVDMGKPQFS AAPMTLLYQS PGCGPLLPLA IQLSQTPGPN SPIFLPTDDK WDWLLAKTWV 360 
RNAEFSFHEA LTHLLHSHLL PEVFTLATLR QLPHCHPLFK LUPHTRYTL HINTLARELL 420 
IVPGQVVDRS TGIGIEGFSB LiQRNMKQLN YSLLCLPEDI RTRGVEDIPG YYYRDDGMQI 480 
WGAVERFVSE IIGIYYPSDE SVQDDRELQA WVREIFSKGF LNQESSGIPS SLETREALVQ 540 
YVTMVIFTCS AKHAA VS ACQ FDSCAWMPNL PPSMQLPPPT SKGLATCEGF IATLPPVNAT 600 
CDVILALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQISRG IQERNRGLVL 660 
PYTYLDPPUENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002742 

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 
„ G AATTCCTTC TCTCCTCCTC CTCGOCCTTC TCCTCGCCCT CCTCCTCCTC CTCGCCCTCC 60 
20 CCTCCCGATC CTCATCCCCT TGCCCTCCCC CAGCCCAGGG ACTTTTCCGG AAAGTTTTTA 120 
TrtTCCGTCT GGGCTCTCGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180 
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCCCGGCCCT GCGCCCCGCC GAGCGATGAG 240 
CGCCCCTCCG GTCCTGCGGC CGCCCAGTCC GCTGCTGCCC GTGGCGGCGG CAGCTGCCGC 300 
_ AGCGGCCGCC GCACTGGTCC CAGGGTCCGG GCCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 
25 CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAGATC GGCCTGAGCC GTGAGCCXJGT 420 
GCTGCTGCTG CAGGACTCGT COGGGG ACTA CAGCCTGGCG CACGTCCGCG AGATGGCTTG 480 
CTCCATTGTC GACCAGAAGT TCCCTG AATG TGGTTTCTAC GG AATGTATG ATAAGATCCT 540 
GCTTTTTCGC CATG ACCCTA CCTCTGAAAA CATCCTTCAG CTGGTGAAAG CGGCCAGTGA 600 
_ rt TATCCAGGAA GGCGATCTTA TTGAAGTGGT CTTGTCACGT TCCGCCACCT TTGAAGACTT 660 
30 TCAGATTCGT CCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720 

CTGTGGAGAA ATGCTGTGGG GGCTGGTACG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAA AT ACCCAACAAT TGCAGCGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACC ATC CGCACATCAT CTGCTG AACT 900 
rtjr CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAGAGT CGTTTATTGG 960 
35 TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTGACAAGAT 1020 
TTTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA GCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140 
AGATTGCAG A TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
A _ CGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 
40 AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320 
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAG ACC ACGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACGAAGAGGA AAAGCAGCAC 1500 
Ae AGTCATG AAA G AAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAACGGCA 1560 
45 CTATTGG AG A TTCG ATAGCA AATGTATTAC CCTCTTTCAG AATGACACAG G AAGC AGGTA 1620 
CTACAAGGAA ATTCCTTTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CTTCAGCTTT 1680 
AATTCCTAAT GGGGCCAATC CTCATTGTTT CGAAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTG TTCTCACCAG 1800 
TGGCGTTGGT GCAGATGTGG CCAGGATGTG GGAGATAGCC ATCCAGCATG CCCTTATGCC 1860 
50 CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACCAAC TTGCACAGAG ATATCTCTGT 1920 
GAGTATTTCA GTATCAAATT GCCAGATTCA AGAAAATGTG GACATCAGCA CAGTATATCA 1980 
GATTTTTCCT GATGAAGTAC TGGGTTCTGG ACAGTTTGGA ATTGTTTATG GAGG AAAACA 2040 
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACGAT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTCGTA ATGAGGTTGC AATTCTACAG AACCTTCATC ACCCTGGTGT 2160 
55 TGTAAATTTG GAGTGTATGT TTGAGACGCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220 
CCATGGAGAC ATGCTGG AAA TO ATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTGATC CTTTTCCTCA 2400 
GGTGAAACTT TGTGATTTTG GTTTTGCCCG GATCATTGGA GAGAAGTCTT TCCGGAGGTC 2460 
60 AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAAC AAGG GCTACAATCG 2520 
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGCCTAAGCG GCACATTCCC 2580 
ATTTAATGAA GATGAAGACA TACACGACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640 
AAATCCCTGG AAGGAAATAT CTCATGAAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700 
- _ AAAAATGAGA AAGCGCTACA GTGTGGATAA GACCTTGAGC CACCCTTGGC TACAGGACTA 2760 
65 TCAGACCTGG TTAGATTTGC GAG AGCTGG A ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820 
TGAAAGTGAT GACCTG AGGT GGGAG AAGTA TGCAGGCGAG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTG CTAGCCACAG TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTG AGCGTG TCAGCATCCT CIQAGTTCCA TCTCCTATAA TCTGTCAAAA 3000 
„ CACTGTGGAA CTAATAAATA CATACGGTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060 
70 TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAG CACTOTTGAT GTATCTGAGT 3120 
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT GAAACACGAA ACTTGTTATT GT GAATGAT T CATGTTATAT TTA ATGCA TT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGGAGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 
75 TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TCCAAAACCC 3420 

ATGTGGGAAA AAA ATG AATG AGGAGGGTAG GG AATAAAAT CCTAAG ACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAOACA ATGCACCTAO CTOTGCAAGA CCTAG TGCTC TTAAGCCTAA ATGCCTTAGA 3600 
AATGTAAACT GCCATATATA AC AG ATAC AT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660 
TATGGAAAAT CAGCTGCTCA GCAACCTTTC ACCTTTGTGT ATTTTTCAAT AATAAAAAAT 3720 
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence: 

Protein Accession #: NP_002733.1 

1 11 21 31 41 51 
| | | | | | 

MSAPPVIJRPP SPLLPVAAAA AAAAAALVPO SGPGPAPFLA PVAAPVGGIS FHLQIGLSRE 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK ILLFRHDPTS ENILQLVKAA 120 
SDIQEGDLIE VVLSRSATFE DFQIRPHALF VHSYRAPAFC DHCGEMLWGL VRQGLKCEGC 180 
, r GLNYHKRCAF KIPNNCSGVR RRRLSN VSLT G VSTIRTSSA ELSTS APDEP LLQKSPSESF 240 
15 IGREKRSNSQ 5YIGRPIHLD KtLMSKVKVP HTFVIHSYTR PTVCQYCKKL LKGLFRQGLQ 300 

CKDCRFNCHK RCAPKVPNNC LGEVTINGDL LSPG AESDVV MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDPDPDHE DANRTISPST SNNIPLMRW QSVKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWRLDSKC ITLFQNDTGS RYYKEIPLSE ILSLEPVKTS 480 
ALIPNGANPH CFEITTANVV YYVGENWNP SSPSPNNSVL TSGVGADVAR MWHIAIQHAL 540 
20 MPVIPKGSSV GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600 

KHRKTGRDVA IKHDKLRFP TKQESQLRNE VAUQNLHHP GWNLECMFB TPERVFWMB 660 
KLHGDMLEMI LSSEKGRLPE HITKFUTQI LVALRHLHFK NIVHCDLKFE NVLLASADPF 720 
PQVKLCDFGF ARIIGEKSFR RSVVGTPAYL APEVLRNKGY NRSLDMWSVG VDYVSLSGT 780 
_ FPFNEDEDIH DQIQKAAFMY PPNPWKEISH EAIDUNNLL QVKMRKRYS V DKTLSHPWLQ 840 
25 DYQTWLDLRE LECKIGERY1 THESDDLRWE KYAGEQRLQY PTHLINPSAS HSDTPETEET 900 
EMKALGERVSIL 



SEQ ID NO:137 PFH3 DNA SEQUENCE 

Nucleic Acid Accession*: X95425 

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

35 | | | | | | 

AATGGTCAGT CAATACATTA TAACATAATA CACCAAATGC TAGAATAG AA GGGGAGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAG AAG GGTGCATCAG TGAATTAAAA AATGTCCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA AC TCTGO ATC 180 

TTTGCTTTTG CTCGCTGCTC TCCTGTTTTT CATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240 

TTATCCTTAG CCACCCTGCT TTTTTCCTCC TTTTTTAAAA AATCGG AGAT TTCGTCTTAA 300 
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTCCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC G ACACCCTTG ATCCGAGTCA GATCTGCACT AGCAACCAGA 480 
ACTAATATTT CATTTAACCC ACCAAAAGGG GGAGGCGAGA GG AGCCAG AA GCAAACTTCA 540 

TCTGTCTCAG ACGGATCCGT GGTTCCTACA TTTGGAGGAG CCGCGTOTCA OAAGGCGTAG 600 
GACCCCAAGG GGGGACAAGG AGGACTCCCG AGTCTCCCTT CTCCGCTCTC CGAGACCGAA 660 
GAGGTGGACT GAGCCGCTCG GGACAGCGGC ACCGGAGGAG GCTCGGAGAA GAJ£CGGGGC 720 
TCGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
AGOCCAGCGT CCCTGGCCXJG CTGCTACTCT GCACCTCGAC GGGCTCCCCT CTGGACGTGC 840 

CTTCTCCTGT GCGCCGC ACT CCQGACCCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900 
TTGGATTCAC GCACTGTCAT GGGGGACCTG GGATGGATTG CM T ICC AAA AAATGGGTGG 960 
G AAG AG ATTG GTG AAGTGGA TGAAAATTAT GCCCCTATCC ACACATACCA AGTATGCA AA 1020 
GTGATGGAAC AGAATCAGAA TAACTGGCTT TTGACCAGTT GGATCTCCAA TGAAGGTGCT 1080 
TOCAGAATCT TCATAO AACT CAAATTTACC CTGCGGGACT GCAACAGCCT TCCTGGAGGA 1140 

CTGGOGACCT GTAAGG AAAC CTTTAATATO TATTACTTTG AGTCAGATGA TCAGAAT GGG 1200 
AGAAACATCA AGGAAAACCA ATACATCAAA ATTGATACCA TTGCTGCCGA TGAAAGCTTT 1260 
ACAGAACTTG ATCTTGGTGA CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGGA 1320 
CCTCTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCTGTGG TACGACACTT GGCTGTCTTC 1440 

CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCIGTGTC 1500 
AACCATTCTG TGACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATG AAG AG AAAAATGG CACCTGTCAA 1620 
GTGTGCAGAC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATGA GGAAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 

AGGAGAGAGT CTGATCCACC CACAATGGCA TGCACAAGAC CCCCCTCTGC TCCTCGGAAT 1800 
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTG ACACT I860 
GGTGGAAGGA AAGACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CCATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGO TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTGAAAAAC 1980 
ACCTCTGTCA TGATGGTGGA TCTACTCGCT CACACAAACT ATACCTTTGA GATTGAGGCA 2040 

GTGAATGG AG TGTCCGACTT GAGCCCAGG A GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100 
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAA AT TGCAAAAAAC 2160 
AGCATCTCTT TGTCTTGGCA AGAACCAG AT CGTCCC AATG GAATCATCCT AOAGTATGAA 2220 
ATCAAGCATT TTGAAAAGGA CCAAOAGACC AGCTACACGA TTATCAAATC TAAAGAGACA 2280 
ACTATTACTG CAGAGGGCTT GAA ACCAGCT TCAOTTTATG TCTTCCAAAT TCGAGCACGT 2340 

ACAGCAGCAG GCTATGGTGT CTTCAGTCGA AGATTTGAGT TTGAAACCAC CCCAGTGTTT 2400 
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTGAC AGTAGGAGTC 2460 
ATTTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCGA ATGTGGCTGT 2520 
GGG AGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AGAAGAGGAA AAGATGCATT TTCATAATGG GC AC ATT AAA 2640 
CTOCCAGGAO TAAGAACTTA CATTOATCC A CATACCTATG AGGATCCCAA TCAAOCTGTC 2700 
CACOAATTTO CCAAGOAOAT AGAAGCATCA TGTATCACCA TTG AG AG AOT TATTGG AGC A 2760 
GGTGAATTTG GTGAAGTTTG TAGTGGACGT TTG AA ACT AC CAGGAAAAAG AGAATTACCT 2820 
GTGGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTCCTAGGT 2880 
GAAGCAAGTA TCATGGG AC A GTTTG ATCAT CCTAACATCA TCCATTTAG A AGGTGTGGTG 2940 
ACCAAAAGTA AACCAGTGAT GATCGTGACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000 
TTTTTGAAO A AAAACGATGG OCAGTTCACT GTGATTCAGC TT G TTGGCAT GCTGAGAGGT 3060 
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTCCCGG 3180 
GTACTGGAAG ATGATCCCGA GGCAGCCTAC ACCACAAGGG GAGG AAAAAT TCCAATCAG A 3240 
TGGACTGCCC CAGAAGCAAT AGCTTTCCGA AAGTTTACTT CTGCCAGTGA TGTCTGGAGT 3300 
TATGOAATAG TAATGTGGGA AGTTGTOTCT TATGGAGAGA GACCCTACTG GGAGATGACC 3360 
AATCAAG ATG TGATTAAAGC GGTAGAGGAA GGCTATCOTC TGCCAAGCCC CATGG ATTGT 3420 
CCTGCTGCTC TCTATCAGTT AATGCTGGAT TGCTGGCAGA AAGAGCGAAA TAGCAGGCCC 3480 
AAGTTTGATG AAATAGTCAA CATGTTGGAC AAGCTG ATAC GTAACCCAAG TAGTCTG AAG 3540 
ACGCTGGTTA ATGCATCCTG CAGAGTATCT AATTTATTGG CAGAACATAG CCCACTAGGA 3600 
TCTGGGGCCT ACAG ATCAGT AGGTGAATGG CTAGAGGCAA TCAAGATGGG CCGGTATACA 3660 
GAGATTTTCA TGG AAAATGG ATACAGTTCA ATGGACGCTG TGGCTCAGGT GACCTTGGAG 3720 
GATTTGAGAC GGCTTGG AGT GACTCTTGTC GGTCACCAGA AGAAG ATCAT GAACAGCCTT 3780 
CAAG AAATG A AGGTGCAGCT GGTAAACGGA ATGGTGCCAT TGTj^ACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence: 
Protein Accession #: CAA64700.1 



1 11 21 31 41 51 
| | | | | | 

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CYSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGWIAFPK NGWEBIGEVD ENYAPIHTYQ VCKYMEQNQN NWLLTSWISN 120 
EGASRDFIEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNKENQ YIKIDTIAAD 180 
ESFTELDLGD RVMKLNTEVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPS WRHL 240 
AVFPDTTTGA DSSQLLEVSG SCVNHSVTDE PPKMHCS AEG EWLVPIGKCM CXAGYEEKNO 300 
TCQVCRPGFF KASPfflQSCG KCPPHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360 
PRNAISNVNE TSVFLEWIPP ADTGGRKDVS YYIACKKCNS HAGVCEECGG HVRYLPRQSG 420 
LKNTSVMMVD IXAHTNYTFE EAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480 
AKNSISLSWQ EPDRPNGUL EYHKHFEKD QETSYTUKS KETTITABGL KPASVYVPQl 540 
RARTAAGYGV FSRRFEFETT PVFAASSDQS QIPVIAVSVT VGVILLAWI GVLLSGSCCE 600 
CGCGRASSLC AVAHPDLIWR CGYSKAKQDP EEEKMHFHNG MKLPGVRTY IDPHTYEDPN 660 
QAVHEFAKEI EASCJTIERV IGAGEPGEVC SGRLKLPGKR ELPVAIKTLX VGYTEKQRRD 720 
FLGEASIMGQ FDHPNIIHLE GWTKSKPVM IVTEYMENGS LDTFLKKNDG QFTV1QLVGM 780 
LRGIS AGMKY LSDMGYVHRD LAARNHJNS NLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840 
PJRWTAPEAI AFRKFTS ASD VWSYGIVMWE WSYGERPYW EMTNQDVQCA VEEGYRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDETVN MLDKURNPS SLKTLVNASC RVSNLLAEHS 960 
PLGSGAYRSV GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020 
NSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016029 

Coding sequence: 76-1097 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 
GGGCGTGCGC GGCCGCAATQ AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 
TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 
TATGGGCCGA GTGGCAGGG A CGACGCCCAO AATGGGAGCT GACTGATATG GTGGTGTGGG 240 
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
TTTCTCTTGT GCTGTCAGCC AG AAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 
TAGAGAATGG CAATTTAAAA G AAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGG ATGTCT 540 
ACAG AAAGCT AATAGAGCTT AACTACTTAO GGACGOTGTC CTTGACAAAA TGTGTTCTGC 600 
CTCACATG AT CG AG AGG AAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGT ATCA 660 
TATCTGTACC TCTTTCCATT CGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 
ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
OACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTG ATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAA CAAG ATG GGOAA GA 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGC AG ACTC TTCTT ATTTT AAAATCTTTA 1080 
AG ACAAAACA TO AdGAAAA GAGCACCTGT ACTTTTCAAO CCACTGG AGO GAG AAATGG A 1140 
AAACATG AAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATOA AATAAAAAAT A AATAATAAA 1260 
AG ATTGCCAT G AATCTTGCA AA 






SEQ ID NO:140 PFH2 Protein sequence: 
Protein Accession #: NP_057113.1 



1 11 21 31 41 51 

| | | | | | 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTGASS 60 
GIGEELAYQL SKIjGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMER 180 
KQGK1VTVNS UjGIISVPLS IGYCASKHAL RGFFNGLRTB LATYPGIIVS NICPGPVQSN 240 

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMUSMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD 



SEQ ID NO:141 PFH1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_021614 

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

AJQAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60 
CGCCGGAACC TGCACGAGAT GGACTCAGAG GCGCAGCCCC TGCAGCCCOC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG CCGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GOTGTCTAAG CCCGAGCACA ACAACTCCAA CAACCTGGCG 240 
CTCTATGGAA COGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC CAAGTCCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGCCACC GGCGCGCCCT GTTCGAAAAG CGCAAGCGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGGTC ATCGAGACCG AGCTGTCGTG GGGCGCCTAC 480 
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGCCTTA TCAGTCTCTC CACGATCATC 540 
CTGCTCGGTC TGATCATCGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 600 
GGAGCAGATG ACTGG AGAAT AGCCATGACT TATGAGCGTA TTTTCTTCAT CTGCTTGGAA 660 
ATACTGOTGT GTGCTATTCA TCCCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCCC ATCCACAACC ACCGCTG ATG TGGATATTAT TTTA TCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAACTTTTC 840 
ACTGATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACGTTTT 900 
GTTATGAAGA CTTTAATGAC TATATGCCCA GGAACTGTAC TCTTGGTTTT TAGTATCTCA 960 
TTATGGATAA 1TGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGGAT 1020 
GTTACTAGCA ACTTCCTTGG AGCGATGTGG TTGATATCAA TAACTTTTCT CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GGAAAAGGAG TCTGCTTACT TACTGGAATT 1140 
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTGACTAA AAGAGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAGTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATG ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AG ACCCAGAA CATCATGTAT GATATGATTT CTG ACTTAAA CGAAAGG AGT 1500 
GAAGACTTCG AGAAGAGG AT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATCCACGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAGA TGGAGAGCTA CGACAAGCAC GTCACTTACA ATGCTQAGCG GTCCCGGTCC 1680 
TCGTCCAGG A GGCGGCGGTC CTCTTCC AC A GCACCACC AA CTTCATCAGA GAGTAGCJAQ 

SEQ ID NO:142 PFH1 Protein sequence: 
Protein Accession #: NP_067627 



1 11 21 31 41 51 

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPAS V GGGGGASSPS AAAAAAAAVS 60 
SSAPEIWSK PEHNNSNNLA LYGTGGGGST GGGGGGGGSG HGSSSGTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSDYALI FGMFGIWMV ETELSWGAY DKASLYSLAL KCUSLSTI1 180 

LLGLUVYHA RBQLFMVDN GADDWRIAMT YERIFFICLE ILVCAIHPIP GNYTFTWTAR 240 
LAFSYAPSTT TADVDIILSI PMFLRLYLIA RVMLLHSKLF TDASSRSIGA LNKINFNTRF 300 
VMKTLMTICP GTVLLVFSIS LWHAAWTVR ACERYHDQQD VTSNFLGAMW USITFLSIG 360 
YGDMVPNTYC GKGVCLLTGI MGAGCTALW AVVARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WLTYKNTKLV KKIDHAKVRK HQRKFLQAM QLRSVKMEQR KLNDQANTLV 480 

DLAKTQNIMY DMISDLNERS EDFEKRIVTL ETKLETLIGS IHALPGLISQ TTRQQQRDFI 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE 

Nucleic Acid Accession #: AL110139, coding region is FGENESH predicted 
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

AJ£CGCGCCG TGCCGCTGCC CGCCCCGCTC CTGCCGCTGC TGCTGCTCGC GCTCCTGGCC 60 
GCTCCCGCCG CCCGCGCCAO CAGAGCCO AG TCCOTCTCCG CGCCGTGGCC CGAACCCGAG 120 
CGCGAGTCGC GGCCACCGCC CGGCCCGGGG CCCGGGAACA CCACCCGGTT TGGGTCTGGG 180 
GCGGCGGGCG GCAGCGGCAG CTCCAGCTCC AACAGCAGTG GCGACGOCTTGGTGACOCGC 240 
ATTTCCATCC TCCTCCGCG A CCTACCCACC CTCAAGGCAG CCGTGATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360 
AAGAAGACAC GCAAGTATGA TATCATCACC ACTCCAGCAG AGCGAGTGGA AATGGCGCCA 420 
CTAAATGAAG AGGATG ATGA AGATGAGGAC TCCAC AGTAT TCGACATCAA ATACAG AGTG 430 

TCCTTGCCGG CTGCACTG AG ACGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTCCT 540 
GTGCCCCCAC CCTTCATCCT CG ACATTGAC CTTCCAGCAA GATGCAGTGG AAGGCCTGAT 600 
GGTGGAATCA GACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTGT GG AAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGGACTGG ACCTGGAAGC CCTCTTGCGT CGOAGGTOTT 120 
GAAACCAAAA CGAACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 

TCAGACTGTC ACTGGCAAGC TCGTTTGCAC GTCACCACAA TGGAGTTGCT TCTGCCACCC 840 
TTTGGGCATC CCTTTAAAGT GCCCCCTACT TCTACTCCCC ATGGTTTTCG ACAACTGCAG 900 
CTGAATCTCA TGG AAAAGCT GGATTCCTCT GCCTTACGCA GAAACACCCG GGCTCCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020 

CCTTGGTGGC ACTTC AGCGC CACAGGCTCT CCAATAAAAA CCCTTTACAC ACAAACCATG 1080 

AGTACCTTGG GCTTGG ATGT TTTCTGTGGT GCCGGCCAGC GGGGCACCTT TTGTG AAGAC 1140 
AGAGCAGTGA CTAAGGTTCT CCAGGGTAGC TCITTCTCCA AACAGCTGCG CTGGAAGCCA 1200 
GCCCTAGAGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TCCGCTGAGC 1260 
ACCCATCCTG TCAGGTTGGC TCGTTCAGAT GCCCGGGOAC AAGCCAGCCT GACGGGGAGG 1320 
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TACCGCAACT 1380 

TGCCTTTTGG TTTTGAAG AT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTTCTACAAA 1440 
ATCTGTCTCC CCTGCTGTGC CGTGGAACAC CTACGGGAAG CCAAGAGAAG CTCAGTGACT 1500 
GTCCTTGCGT CATTTG AGCA GAGCCCACAA AAGGCAGCTG CTGCCCACGG GGAGCCTGTC 1560 
AAACGAGGGC CCAGTGGGCA ATTGACCAGA CACACATGCC CTGGCTGGGG GATCACACAT 1620 

GCGAACCTGC AGACAATTCC AGATACCCAA GGCCAGGAAG GCCCACGTGA GGATGTCACT 1680 

CACCCTGGAG GAG ACTTGG A TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740 
GATGGCAGAT GCCAGAAG AT GGTCCTG ATG TCTGAGG AAG GGCCACCTAG TTTGACAGGA 1 800 
TGTG AGAGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTCCTTCCTT 1860 
TCCCCCCGAC AGCCCCTGTT TCTGTCCAGG CCCTGA 


SEQ ID NO:144 PFG9 Protein sequence: 
Protein Accession #: none available, FGENESH predicted 

1 11 21 31 41 51 
| | | | | | 

MRAVPLPAPL LPLLLLALLA APAARASRAE SVS APWPEPE RESRPPPGPG PGNTTRFGSG 60 
AAGGSGSSSS NSSGDALVTR ISILLRDLPT LXAAVTVAFA FTTLLIACLL LRVFRSGKRL 120 
KKTRKYDHT TPAERVEMAP LNEEDDEDED STVFDIKYRV SLPAALRRQL PGCQTLLTVP 180 
VPPPFILDID LPARCSGRPD GGIRPGKTCF PAWWHPVESW SAATWGVKDW TWKPSCVGGV 240 
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQIjQ 300 
LNLMEKLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDLPN PWWHFSATGS PDCTLYTQTM 360 
STLGLDVPCG AGQRGTPCED RAVTKVLQGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420 
THPVRLARSD ARGQASLTGR RVFRRPRQSL HGGGSAGTAT CLLVLKILLR RHPHLDLFYK 480 
ICLPCCAVEH LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGITH 540 
ANUQTJPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600 
CERLTGSHHF SSHSKSWSFL SPRQPLFLSR P 



SEQ ID NO:145 PFG6 DNA SEQUENCE 
Nucleic Acid Accession #: NM_013427 

Coding sequence: 876-3799 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

GGCTGGGCTG CGAATAGCGT GTTCCTC70C GGCGG AACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT CCACGGAGAG CGCTGAGCGC CGCCGGGAAT TCCATCCCAC 120 
CGTGGGCACG CAGTCTTTGG AGGTCCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
CAAGACAGAG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAG AGTGGGC GCAGCAGCCC AGCGGAGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 

AGAGAGTGCA GGGAGGCGCA GCTCAGGCGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCO 360 
GCGCCGGGAG CGCGGTGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TCGGGACTGT CCTCGGGCGG CAAGGAGGAG CTTGCTGGAG TCTT AGAGGC CATCCAGAGC 480 
CAGCGAGCAG GAGCGCTGCG TCTCCCGCCT CAGCTAGG AA GGOGGAGTGG CGCTGGCAGG 540 
CTGG AGCTGG G AACCCAGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTTCGC 600 

GTCTTGGGCT CCGGAGGAAG GTTCTAGCGG CTGCAGGAGG TCCCCAGAOC CATTTTCCTA 660 
GAAGGCTGGT GATGGATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGGAG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTO 780 
GCACCTTTGC CTGAGTCCCT TTCGGTTCCC GACCCAAAGC CACCAGCGTC CAGGG AGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC COA QATG TCC GCGCAG AGCC TGCTCCACAG 900 

CGTCTTCTCC TGTTCCTCGC CCGCTTCAAG TAGCGCGGCC TCGGCCAAGG GCTTCTCCAA 960 

GAGGAAGCTG CGCCAG AOCC GCAGCCTGGA CCCGGCCCTG ATCGGCGGCT GCGGG AGCGA 1020 
CGAGGCGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAG AGTCTCG GCCCTCGCTT GGCGTCCTCT TCCCGGGGTC CGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGGAGAAG TCACCATCCO GCAGCTTTCA CTTTGACTAT GAGGTTCCCC TGGGTCGCGG 1260 
CGGCCTCAAG AAG AGCATGG OCTGOG ACCT GCCTTCTGTC CTGGCCGGGC C AGCCAGTAG 1320 
CCGAAGCGCT TCCAGCATCC TCTGTTCATC CGGGGG AGGC OCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGGAAGTT CCAGTCCCCA CCCGACAGTC GCGGGCACCC 1440 
CTACGTCGTG TGGAAATCCG AGGGTGATTT CACCTGGAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTGAGG TCAGTCCCCA TCCAGAGTCT CTCAGAGCTG GAGAGGGCCC GGCTGCAGGA 1560 
AGTGCCTTTT TATCAGTTGC AACAGG ACTG TGACCTGAGC TOTCAGATCA CCATTCCCAA 1620 
AGATGGACAA AAGAGAAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAGGAGAA 1680 
AAACAAAGAC AAAGAATTCA TCCCACAGGC ATTTGGAATG CCCTTATCCC AAGTCATTGC 1740 

GAATGACAGG GCCTATAAAC TCAAGCAGGA CTTGCAGAGG GACGAGCAGA AAGATGCATC 1800 
TGACTTTGTG GCTTCCCTCC TCCCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAACCTCAGA AACACCGAAT GAGTCAACGT CCCCAAACAC 1920 
CCCGGAACCG GCTCCTCGGG CTAGGAGGAG GGGTGCCATG TCAGTGGATT CTATCACCGA 1980 
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTTTCCTTGC CTGCTGAGGC 2040 

TCAAAGTAAA AAGGAAAAAG CCAGAG ATAA GAAACTCAGT CTGAATCCTA TTTACAGACA 2100 
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCACCTAGAA AAACATGGCC TCCAGACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAGTGAGA CAATTACGTG AGGAATTTGA 2220 
CCGTGGGATT GATGTCTCTC TGGAGGAGGA GCACAGTGTT CATGATGTGG CAGCCTTGCT 2280 
GAAAGAGTTC CTGAGGGACA TGCCAG ACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340 
CATCAACACT CTCTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTCCAGC TCCTCATATA 2400 
CCTTCTACCT CCCTGCAACT GCG ACACCCT CCACCGCCTQ CTACAGTTCC TCTCCATCGT 2460 
GGCCAGGCAT GCCG ATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520 
GACATCTCTA AACTTAGCCA CCATATTTGG ACCCAACCTG CTGCACAAGC AG AAGTCATC 2580 
AGACAAAG AA TTCTC AGTTC AGAGTTCAGC CCGGGCTGAG GAGAGCACGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTCCA 2700 
GAACGAAGTG CTG ATCAGCC TGTTAGAGAC CGATCCTGAT GTCGTGGACT AT TTAC TCAG 2760 
AAGAAAGGCT TCCCAATCAT CAAGCCCTG A CATGCTGCAG TCGGAAGTTT CCTTTTCCGT 2820 
GGGAGGGAGG CATTCATCTA CAGACTCCAA CAAGGCCTCC AGCGGAGACA TCTCCCCTTA 2880 
TG ACAACAAC TCCCCAGTGC TGTCTG AGCG CTCCCTGCTG GCTATGCAAG AGO ACGCGGC 2940 

CCCGGGGGGC TCGGAGAAGC TTTACAGAGT GCCAGGGCAG 1TTATGCTGG TGGGCCACTT 3000 
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGG ACCAAGO CTTGGGAAAG ATCTGTCAGA 3060 
GGAGCCTTTC GATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAG ACCC 3120 
AGCAATGACA GGTTCCTCTG GAG ACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 3180 
CTCCCTTTCT CAAGGGAACC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240 
GCTGGACAGC GACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCCCG CGACGGAGGG 3300 
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGGAA 3360 
AGCCGAGCGG GCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACGACCT 3420 
CAGCGAGAGT GAGCTGGATG TGGOCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480 
ACCCCATGGG AGTGGGAGGG ATCACAAGCG GCCCCCGCCTCCATACCCGG GCCCAGGGAA 3540 
GCCCGCGGCA GCGGCAGCCT GG ATCCAGGG GCCCCCGGAA GGCGTGGAGA CACCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAGAAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGG ACAG TCCGCGCCTG GGGGACGCTG GCTGGCTCG A 3720 
CTGGCAGAGA GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780 
GCCCGAG ACG CTGGTC32AG OCCGCACCCA GCCG AGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAQTGT TCTTCTTTCA 3900 
CACTTCTCAA AAGTG ACACA AG AG AAATCC AGTTCACCTA CAG AGGTAG A GCACTCACGC 3960 
CCCCGCCATT GAGAATAAGG TTCCATTGOG TAGCCAGCCT TAGGAAAAAC AAACAOAACC 4020 
CAAACCAG AT GGCAATGTCC AATCTAAAAA CGTCCCTCTT GGCTCTATAA TATAAGATAC 4080 
AACTCTTGCT TGGTATAGCC TAACCGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 
TCTGTAACAG ATTATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATA CTAAA CAATO AG ATT 4260 
CTATAG AATG TTCTAG AATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CTCTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGGAGTC AG ATACAAAA AGAAAAATCA CTGAATGCTT TTAGATATTG 4440 
AATACGTTTT CAGGAAAATG CTAAATCTGA TAG ATTACGA AATATATTTT TAGAACTTGT 4500 

TTAGAAAGGA TTCAGTTAAC CAAACAAOAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTAC AAACT GGAAT C CAAC TATAAAGTGT 4680 
TTAAGAATCT ACACAGAATA TTCAAATTAT AGAACATGTT TTTTCCCTTT GCCCCATAAT 4740 
CAGTATTTOC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGG CCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGGAAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTOTAAA ATGTTCTACC OTACTTTAGT AGTTT GAAGT TTTCAAGTG C ATAACTATTT 4920 
TTGACCAGCA GAAGGCGATA CG CTTCAGTA TTTTATGCAA 11111 TITCA CTTCGA AOGO 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTGATGAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100 
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 

SEQ ID NO:146 PFG6 Protein sequence: 
Protein Accession #: NP_038288.1 

1 11 21 31 41 51 

MSAQSLLHS V FSCSSPASSS AASAKGFSKR KLRQTRSLDP AUGGCGSDE AGAEGS ARGA 60 
TAGRLYSPSL PAESLGPRLA SSSRGPPPRA TRLPPPGPLC SSFSTPSTPQ EKSPSGSFHF 120 

DYEVPLGRGG LKKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKPQ 180 
SPPDSRGHPY WWKSEGDPT WNSMSGRS VR LRS VPIQSLS ELERARLQEV PFYQLQQDCD 240 
LSCQITIPKD GQKRKKSLRK KLDSUGKEKN KDKEF1PQAF GMPLSQVIAN DRAYKUCQDL 300 
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PNESTSPNTP EPAPRARRRG 360 
AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCCQH 420 
LEKHGLQTVO ffRVGSSKKR VRQLREEFDR OIDVSLEEEH SVHDVAALLK EFLRDMPDPL 480 
LTRELYTAFI NTLLLEPBEQ LOTLQLLIYL LPPCNCDTLH RLLQFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNLATIFGP NLLHKQKSSD KEFSVQSSAR AEESTAHAV VQKMENYEA 600 
LFMVPPDLQN EVLI5LLETD PDVVDYLLRR KASQ5SSPDM LQSEVSFSVO GRHSSTDSNK 660 
ASSGDISPYD KNSPVLSERS LLAMQEDAAP GGSBKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRLGKDLSEB FPDIWOTWHS TLKSGSKOPG MTGSSGDIFE SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSRACS TPHVQVAGKA ERPTARSEQY 840 
LTLSGAHDLS ESELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKF AAAAAWXQGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSS AN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960 
LSTDNPDALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE 
Nucleic Acid Accession #: NM_002202 

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTCCTAGAT CCGCGAGGGC GCGGCGCAGC CGAGCAGCGG CTCTTTCAGC 1 20 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 
GGCTGTTCAC CAACTGTACA ACCACCATTT CACTGTGGAC ATrACTCCCT CTTACAG ATA 240 
JJSGOAGACAT GGG AG ATCCA CC AAAAAAAA AACGTCTO AT TTCCCTATGT GTTGGTTGCG 300 
GCAATCAGAT TCACGATCAG TATATTCTGA GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGC l ' l ' l G 11 A 420 
GGGATGGG AA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGGATC AAATGCGCCA 480 
AGTGCAGCAT CGGCTTCAGC AAGAACGACT TCGTGATGCG TGCCCGCTCC AAGGTGTATC 540 
ACATCGAGTG TTTCCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG GACGAATTTG 600 
CGCTTCGGG A GG ACGGTCTC TTCTGCCG AG CAG ACC ACG A TGTGGTGG AG AGGGCC AGTC 660 
TAGGCGCTGG CGACCOGCTC AGTCCCCTGC ATCCAGCGOG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 
CCACCCGCGT GCGGACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840 
CCGCAAACCC GCGGCCAGAT GCGCTCATGA AGG AGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCGAAGCA 960 
TCATG ATGAA GCAACTCCAG CAGCAGCAGC CCAATGACAA AACTAATATC CAGGGGATGA 1020 
CAGGAACTCC CATGGTGGCT GCCAGTCCAG AGAGACACGA CGGTGGCTTA CAGGCTAACC 1080 
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT GAGCGACTTC GCCTTGCAGA 1140 
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGACCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGCAIQAG GAACATTCAT TCTCTA' l 1 1 1 1U 1 CCCTGT 1320 
TGG AG AAAGT GGG AAATTAT AATGTCGAAC TCTG AAACAA AAGTATTTAA CG ACCC AGTC 1380 
AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACGAAGTCTG TTTTAATGAC 1440 
AAGGTGATAT GGTAGCAACA CTGTGAAGAC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500 
AAACA AAAOG CAAAACCCAG TATATGCTAT TCAATGATCT TAG AAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAGAG GATTTATATT CAAGGATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGC AC ATC TAGAG AAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TGCTGTTTCT ATATTGGTCA TTGCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740 
AGAGACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GTTGACTTTA ACAAGGGGTT AATTGAAATC CTGGGTCTCT TGGCCTGTCC 1860 
TGTAGCTGGT TTATTTTTTA CTTTGCCCCC TCCCCACTTT TTTTG AGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGGTTTT TGAATTCAGA TTTAAAAACC AACTTATAAA 1980 
GCATTOCAAC AAGGTTACCT CTATTTTGCC ACAAGCGTCT CGGGATTGTG TTTGACTTGT 2040 
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GGAAATAAAA AGGAAAAAAA AAAGGAAACT TTTTTTGTTT GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGGAAG ACTTGCCACT TTTCATGTCA 2220 
TTTGACATTT TTTGTTTGCT GAAGTGAAAA AAAAAGATAA AGGTTGTACG GTGGTCTTTG 2280 
AATTATATGT CTAATTCTAT GTOTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340 
GCCATATGTA GA ATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAGAT 



SEQ ID NO:148 PFG4 Protein sequence: 
Protein Accession #: NP_002193.1 

1 11 21 31 41 51 
| | | | | | 

MGDPPKKKRL ISIXVGCGNQ IHDQYILRVS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60 
KTYCKRDYIR LYGIKCAKCS IGFSKNDFVM RARSKVYHE CFRCVACSRQ UPGDEFALR 120 
EDGLFCRADH DVVERASLGA GDPLSPLHPA RPLQMAAEP1 S ARQPAtRPH VHKQPEKTTR 1 80 
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VIRVWFQNKR CKDKKRSIMM 240 

KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDl 300 

DQPAFQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMV ASPIEA 

SEQ ID NO:149 PFG2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001172 

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GCGGAGCTCT GCCTTGGAGA TTCTCAGTGC TGCGG ATCAX£TCCCT AAGG GGCAGCCTCT 60 
CGCGTCTCCT CCAG ACGCGA GTGCATTCCA TCCTGAAGAA ATCCGTCCAC TCCGTGGCTG 120 
TGATAGGAGC CCCGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180 
CCATAAGAGA AGCTGGCTTG ATGAAAAGGC TCTCCAOTTT GGGCTGCCAC CTAAAAGACT 240 
TIGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATG ATCT CTACAACAAC CTGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AG AGCTGTGT 360 
CAGATGGCTA CAGCTGTGTC ACACTGGGAG GAG ACCACAG CCTGGCA ATC GGTACCATTA 420 
GTGGCCATGC CCGACACTGC CCAGACCTTT GTGTTGTCTG GGTTGATGCC CATGCTGACA 480 
TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATGG ACAGCCAGTT TCATTTCTCC 540 
TCAGAGAACT ACAGG ATAAG GTACCACAAC TCCCAGGATT TTCCTGGATC AAACCTTGTA 600 
TCTCTTCTGC AAGTATTGTG TATATTGGTC TGAG AGACGT GGACCCTCCT GAACATTTTA 660 
TTTTAAAGAA CTATGATATC CAGTATTTTT OCATGAGAGA TATTGATCGA CTTGGTATCC 720 
AGAAGGTCAT GGAACGAACA TTTGATCTGC TGATTGGCAA GAGACAAAGA CCAATCCATT 780 
TGAGTTTTGA TATTGATGCA TTTGACCCTA CACTGGCTCC AGCCACAGGA ACTCCTGTTG 840 
TCGGGGGACT AACCTATCGA GAAGGCATGT ATATTGCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGGATCTT GTTGAAGTCA ATCCTCAGTT GGCCACCTCA GAGGAAGAGG 960 
CGAAGACTAC AGCTAACCTG GCAGTAGATG TGATTGCTTC AAGCTTTGGT CAGACAAGAG 1020 
AAGGAGGGCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT GAATCAGAAA 1080 
ATCAAGCACG TGTG AG AATT IAQGAGAC AC TGTGCACTGA CATGTTTCAC AAC AGGCATT 1140 
CCAGAATTAT GAGGCATTGA GGGGATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCTTAATGA GAACATTTAC ACATTCTCAC AATTGTAAAG TTTCCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTTTTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320 
TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 
CTIGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTnTCAT CTTTCCTCCC TCCTCCCACA 1440 
GCCTGGCTAT AC AGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA CCCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCGAGCT 1620 
CCAGTAAGAT GATAATGGAA AGCAGCAGCT TGTTGGTTGT CACTCTACAA AGAGAAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCTTCTAA ACATTTGGGG GTTAGACCTG 1740 
GGACCACGGC TGGATACTCT GAGGCTGTAT GTTTGATCAC ACAGCCACTT AGCAGGAAGT 1800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTACCTCAC AGAAATGTTA 1 860 
AACTGAGACA ATAAAACCCA AAGCAT 



SEQ ID NO:150 PFG2 Protein sequence: 
Protein Accession #: NP_001163.1 

1 11 21 31 41 51 

MSLRGSLSRL LQTRVI1SILK KSVHSVAVIG APFSQGQKRK GVEHGPAAIR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNUVNPR SVGLANQELA EVVSRAVSDG YSC VTLGGDH 1 20 
SLAIGTISGH ARHCPDLCW WVDAHADINT PLTTSSGNLH GQPVSFLLRE LQDKVPQLPG 180 
FSWIKPCISS AS1VYIGLRD VDPPEHFILK NYDIQYFSMR DIDRLGIQKV MERTFDLUG 240 
KRQRPIHLSF DID AFDPTLA PATGTPVVGG LTYREGMYIA EEIHNTGIXS ALDLVEVNPQ 300 
LATSEEEAKT TANLAVDVIA SSPGQTREGG HIVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017906 

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

| | | | | | 

AATTATATAT TTTTACTCTA TGTTTCTCTA CATGTTTTTT TCTTTCCGTT GCTGGCGGAA 60 
GAGGCACGTG CGCTGCTG AA TG GAGCTGGT CGCTGGTTGC TACGAGCAGG TCCTCTTTGG 120 
GTTCGCTGTA CACCCGO AGC CCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 180 
TG ACTTC ACT CACCATGCTC ACACTGCCTC CTTGTC AGCA GTAGCTGTAA ATAGTCGTTT 240 
TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATOAAAA AOAAGATTGA 300 
GCATGGGCCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGG AGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AG AAATGGG A 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGACC TTCCTTTCTA TTCACCCATC 480 
TGGCAAGTTG GCCCTGTCGG TTGGTACAG A TAAAACTTTA AGAACGTGGA ATCTTGTAGA 540 
AGGAAGATCA GCATTCATAA AAAATATAAA ACAAAATGCT CACATAGTAG AATGGTCCCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGCATCCATT AGTGGCACCA TCACAAATGA AAAG AGAATT TCC TCTGTT A AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AGAAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTGCGAAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 
TGAAATTCCA GAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGG ATAAGA AAGTTCCCCC ATCTTTACTC TGTOAAATAA ACACTAATGC 960 
CAGGCTGACG TGTCTTGOAG TGTGGCTAGA CAAAOTGGCA GACATGAAAA GCCTTOCTCC 1020 
AGCTGCAGAG CCTrCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TGACACAGTG CACAAAOAAG AAAAGCGGTC AAAACCTAAC ACAAAGAAAC GCGOTTTAAC 1140 
AGGTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1 200 
GGTAGAAATG TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260 
AGATGTCTCC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
1 1 1 1 1 1 1 1CCCTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAA A 1380 
AAACCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGG 1440 
CAG ACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTT TGTAC A AAGC AAATAAAG AT CTTTCTC AAA AAAAAAAAAA AAAA 




SEQ ID NO:152 PFG1 Protein sequence: 
Protein Accession #: NP_060376.1 



MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFVVTGSK 60 
DETIHIYDMK KKIEHGALVH HSGTITCLKF YGNRHLISGA EDGLICIWDA KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAFI KNIKQNAHIV EWSPRGEQYV 180 
VIIQNKIDIY QLDTASISGT ITNEKRISSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCE 240 
FKAHENRVKD MFSFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCEI NTNARLTCLG 300 

VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGLISTK KRKMVEMLEK KRKKKKIKTM Q 




SEQ ID NO:153 PFD6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014668 

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GATGTCTTGO ACATGCTCTG GCTGGCTAAT CTCCATGTTC TAGCCGACTG AAAATACGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAOT CTAAAGAAAT TTCCTTTTGA TGTGGCAOAA 120 
AATCGAGGAT GTGGAGTGGA G ACCCCAG AC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 

CCTGATCTTC AGTGGGATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TGAGGTACTG 240 
TGACCTGCGA TTG ATAAACT CCTCCTGCTT GGTGAGAACA GCCTTGGAGC AGGAGCTGGG 300 
CCTGGCTGCC TACTTTGTGA GCAACGAGGT TCCCTTGGAG AAGGGGGCTA GGAACGAGGC 360 
CTTGGAGAGT GATGCTGAGA AGCTGAGCAG CACAGACAAC GAGGATGAGG AGCTGGGGAC 420 
AGAAGGCTCT ACCTCGGAGA AGAGAAGCCC CATGAAAAGG GAGAGGTCCC GCTCCCACGA 480 

CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCGAGTCCTC 540 
GGCTCAGCCC ACAGCACTCC CCCAGGG AGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTG AGAA ACAGAGGCCC CGGGCAAGTC AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGGAC 720 
CGGCCAGAGG AGCGTCCAGG TGTCGGTCAC CTCGTCGTGC T0CCAGCTGT CCTCCTCCTC 780 

GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840 
GTGCTCCTTG AGCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCTTGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GCCGCCTCCC TCCTGCCCTC 960 
CCCCTCGGTC ATGTGGGCCA GCTCTTTCCG CCCCCTGCTC AGCAAGACCA TGACATCCAC 1020 

CGAGCAGTCC CTCTACTACC GGCAGTGGAC GGTGCCCCGG CCCAGCCACA TGGACTACGG 1080 

CAACCGGGCC GAGGGCCGCG TGGACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140 
CCCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATOCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGOAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCTGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAG ATGCCA GCCTGATTTG 1380 

TTCGCACTAT CAGGOTATAA AGAGTGAAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440 
TTATGTGCGG CGTCAGACGG CACGG ATGAG ACTGTCCAAG TACGCAGCOT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCCCGCTACC AGCTGTATG A 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCTTA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 

CTTCATCATC CCCAAGTCCA AGGAGCACCA CTTTGTCTTC AGCCAACCTG GAGGCCAGCT 1680 

GGAGAGCATG CGACTACCCC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740 
ATTCACTCCA ACCACCGGCC GTCACGAACA TGGGCTCTTT AATCTGTACC ACGCAATGG A 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGCCC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGGAGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 

GCAGGAGGAG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040 
CTCCTGCGTG ATGTGG AACG TGGTGG ATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100 
CTCCTGGTCG GAAAGOAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCCC TGCTGGGCCT GCGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCCCTTCT CCCGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGG A 2280 

CCTGACCCAG AACGTGCAGT ACAACCAGAA CCGGTTCCTG TGTGACGATG TAGACTTCAA 2340 
CCTGCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTGATGAA 2400 
GAAGCAG ATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGO TGTCTGATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGCGCCC GCCCAGCTCC TGCTGOAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580 

CCCGCTGTCC CTGAAGAACC ATGACCACCC AGTGCTGTCT GTCGACTGTT ACCTGAACCT 2640 
GGGATCTCAG ATTTCTGTTT GCTATGTGAG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTCGGACTTG CTGTTCAGTG GOCTGCTGCT GTACCTCTGT GACTCTTTTG TGGGAGCTAG 2760 
CTTTTTGAAA AAGTTTCATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG CGCCAGACGG TCGTCCGCCT GGAGCTCGAG GACGAGTGGC AGTTCCGGCT 2880 
GCGCOATGAG TTCCAGACCG CCAATQCCAO GGAAGACCGG OCGCTCTTTT TTCTOACGGO 2940 
ACGACACATC TGAGGAAGAC AGCGGCGAGT TTTCTGAAGA GATOAGTGCT CAGAGCCCTC 3000 
ATGCTGTTGA GGCTAAAGGG AGGCCTGGAA CGGTGGGGCG TTTGACTGGA ATGGACCCCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCCGAGGCCG TGGTCCTGGG 3120 
AGCCAGG AAG ACTCCGCAGT GGGTOAGAAT GAAAACTTGA GACTCCCAAG TTCTGGGCCA 3180 
GCCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACAAAGATT TACTTCCTGT 3240 
CCTGCCATTC GTGTGCTTCC ATGGAC AAAC CTGATTTTTT TCTCTTAGTT CTAAAG AATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAACC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTAA AGGCTGG AGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATA AAGTTA AACAAATTGA TTTACTTC AG 3540 
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA G AAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGCGGGGA 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720 
TGCTTACTTG AAACAGACAA TGAAAACA AC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840 
CTCTTACTAA TGTTTGGTTC CTCAGGGAAG AATCTCACTT GACTAGAGAG GAGGTGGGAA 3900 
CAGAAGAGAG AAGGAGGCAG GG AGATGTAT TTCTTAGGGC TCACCCCITC ACAGACTGAC 3960 
AGAATGGTTT TGTTTTGTTT TGTTTTGTTT TGTTTTGTTT TTGAGATGGA CTCTAGCTCT 4020 
GTCACCCAGG CTGG AGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG CCCACCACCA 4140 
CGCCCGGCTA ATTTTTTGTA TTTTTTAGTA GAGACGGGGT TTCACCATGT TAGCCAGGAT 4200 
GGTCTCGATC TCCTGACCTC GTGATCCGCC CGCCTCGGCC TCCCAAAGTO CTGGGATTAC 4260 
AGGCGTGAGC CACCGTGCCT GCGCCAGAAT GG1 1111 AAA GCCACAGTTG AGAGGCCACC 4320 
CATTGCCCGG CGCCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380 
ATTGGAATTA TTCATCCCXT TTG AAAGATG AGAAGGTTGA GATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTGGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTCCCTTC TCCCACTTGC CTACCCTCAA TGCCACACTG TTTTTGAAGT 4560 
GGCCCATAAC TTGAAGGAAA AGTTTAAAGA CAGTTCAATT TAATCATCAG AATGCATTCT 4620 
lUlTlTl 1C GGAGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTCCAG CCTCAGCCTC 4740 
CCGAGTAGCT GGGATTATGG GCGCCCACC A CCATGCCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTGATCTG CCCACCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAGACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA GAATAAATGA TTGTAAAAGA TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTCCG CGAGTATTAA AT ATTTAG AT C AATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGG A ATAAAAACTG AC C1 111 11 A ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTTGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence: 
Protein Accession #: NP_055483.1 

1 11 21 31 41 51 
| | | | | | 

MWQKIEDVEW RPQTYLELEG LPCILIFSGM DPHGESLPRS LRYCDLRLIN SSCLVRTALE 60 
QELGLAAYFV SNEVPLEKGA RNEALESDAE KLSSTDNEDE ELGTEGSTSE KRSPMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTAL PQGEHARSPQ PRGPAEEGRA PGEKQRPRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMVVST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGFHPRRLL LSGPPQIGKT GAYLQFLSVL 360 
SRMLVRLTEV DVYDEEEINI NLREESDWHY LQLSDPWPDL ELFKKLPFDY IIHDPKYEDA 420 
SLICSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AFSYSMLGEE IQLHFIIPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEYI 540 
KSPTFTPTTG RHEHGLFNLY HAMDGASHLH VLVVKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFLI KELSYHNLEL ERNRQEELGI KPQDIWPFIV ISDDSCVMWN VVDVNSAGER 660 
SREFSWSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFII 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIVVG GHRSFHITSK 780 
VSDNSAAVVP AQYICAPDSK HTFLAAPAQL LLEKFLQHHS HLFFPLSLKN HDHPVLSVDC 840 
YLNLGSQISV CYVSSRPHSL NISCSDLLFS GLLLYLCDSF VGASFLKKFH FLKGATLCVI 900 
CQDRSSLRQT VVRLELEDEW QFRLRDEFQT ANAREDRPLF FLTGRHI 



SEQ ID NO:155 PFC8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000522 

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

ATGACAGOCT CCGTGCTCCT CCACCCCCGC TGGATCGAGC CCACCGTCAT GTTTCTCTAC 60 
GACAACGGCG GCGGCCTGGT GGCCGACGAG CTCAACAAGA ACATGGAAGG GGCGGCGGCG 120 
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TCGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACC AGTG CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCC AGO A 300 
GCCGCGTCCG OCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCOCTGCT 360 
GCCGCGGCTO CCGCTGCAGC CGCCGCCGCC GCCGCCGCOT CGTCCTCGGG AGGTCCCGGC 420 
CCGGCGGGCC CGGCGGCGGC AGAGGCGGCC AAGCAATGCA GCCCCTGCTC GGCAGCGGCG 480 
CAG AGCTCGT CGGGGCCCGC GGCGCTGCCC TATGGCTACT TCGGCAGCGO CTACTACCCG 540 
TGCGCCCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC CCCCTCGGCC 600 
GCCGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCG CCGGCCCAGC TGCOG AGGAO 660 
TICAGCTCCC GCGCTA AGG A GTTCGCGTTC TAOCACCAGG GCTACGCAGC CGGGCCTTAC 720 
CACCACCATC AGCCCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780 
CCCGGCGAGT CGCGCCACGA ACCCTTGGGT CTTCCCATGG AAAGCTACCA GCOCTGGGCG 840 
CTGCCCAACG GCTGG AACGG CC AAATGTAC TOCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900 
CTCTGGAAGT CC ACTCTGCC CG ACGTGGTC TCCCATCCCT CGGATGCCAG CTCCTATAGG 960 
AGGGGGAGAA AGAAGCGCGT GCCTTATACC AAGGTGCAAT TAAAAGAACT TGAACGGG AA 1020 
TACGCCAOGA ATAAATTCAT TACTAAGGAC A AACGGAGGC GGATATCAGC CACGACGAAT 1080 
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGGA GGGTTAAAGA GAAAAAAGTC 1140 
ATCAACAAAC TGAAAACCAC TAGTTAA 



SEQ ID NO:156 PFC6 Protein sequence: 
Protein Accession #: NP_000513.1 

1 11 21 31 41 51 

MTASVLLHPR WIEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPSAAAAA 120 
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYFGSGYYP 180 
CARMGPPPNA IKSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240 
HHHQPMPGYL DMPVVPGLGG PGESRHEPLG LPMESYQPWA LPNGWNGQMY CPKEQAQPPH 300 
LWKSTLPDVV SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360 
LSERQVTIWF QNRRVKEKKV INKLKTTS 
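
The coding-sequence coordinates quoted for each DNA entry can be cross-checked against the residue numbering of the paired protein entry. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the quoted range includes the stop codon; the helper name `protein_length_from_cds` is illustrative and not part of the listing.

```python
def protein_length_from_cds(start: int, end: int) -> int:
    """Number of residues encoded by a coding sequence given as 1-based,
    inclusive nucleotide positions that include the stop codon (assumption)."""
    n_nucleotides = end - start + 1
    if n_nucleotides % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    # One codon per residue, minus the terminating stop codon.
    return n_nucleotides // 3 - 1


# SEQ ID NO:155 is annotated with coding sequence 1-1167.
print(protein_length_from_cds(1, 1167))  # 388
```

For SEQ ID NO:155 this gives 388 residues, which agrees with SEQ ID NO:156: its last numbered line ends at residue 360 and is followed by 28 further residues.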


SEQ ID NO:157 PFA3 DNA SEQUENCE 

Nucleic Acid Accession #: AW102723 
Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
CCTAGTGGGT GGG ACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGOCGGGCG TGATCTCACC 240 
ATGTGCGG AT TTGCG AGGCG CGCCCTGG AG CTGCTAGAG A TCCGG AAGCA CAGCCCCGAG 300 
GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 
TCTCCGGCCT GTCTGCACOC TGTCGCCTG A GCTGCCTG AC AGTG ACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 
AAGG ATCTCA AG ATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 
AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT COCAGAGTTT 780 

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 
AAATCTTTGG AAAG AG AAGACTTTG AAAAA ACAATTGCAG AGCAAGCAGT GCAGCAG AGT 900 
CCAGTGG AGT TATCAAAGAA TCTCTTCGTG AAGAGGTTTT TAAAATATGT TACGAGG AAO 960 
ATG AAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 
TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 
AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 
ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAOTTTGTG 1260 

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 
CATTTCATGT TTG ACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTO 1440 
ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTGAGGAGAT GGG ACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 
TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 
AGGGATGTGG TCTTAATAGO GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 
GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAG AAAAAG 1860 
ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 
TTCACTGCC A TCTGCTCCC A GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 
TACACTCGCT TCGACGAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGCCTATTG TGTGGCTTGG GGG ATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 
GCGCTGATGG CCCTGAAGAT G ATGG AGCTC TCTG ATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 
AAAATGCCCC GTTACTGTCT TTTTGGAAAC A ATGTCACTC TOGCTAACAA ATTTGAGTCC 2340 
TGCAGTGTAC CACGAAAAAT C AATGTCAGC CCAAC AACTT ACAG ATTACT CAAAG ACTGT 2400 
CCTGGTTTCG TGTTTACCCC TCG ATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTG AA 2460 
ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 
TTCCAAAAGA AAGATOTGOA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 
TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 
CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 
AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 






SEQ ID NO:158 PFA3 Protein sequence: 

Protein Accession #: NP_000847.1 

1 11 21 31 41 51 
| | | | | | 

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180 
GQLEDASILC LDKEDDFLHV YYFFPKRTTS LILPGIIKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPY LLYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTILQFG 300 
NGIRRLMNRR DFQGKPNFEY FEILTPKINQ TFSGIMTMLN MQFVVRVRRW DNSVKKSSRV 360 
MDLKGQMIYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDVV LIGEQARAQD 420 
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQVVQ AKKFSNVTML 480 
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVETIAMPIV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AGVVGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKINVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660 
TNSKPCFQKK DVEDASQFFR QSERNRLATY IPIYKSLGFD SLKMCRASES TLGIVDG 



SEQ ID NO:159 PFA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004362 

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CGCCGGCGGG ACTGGTCTGA AGAGACGCGG GGACAAAGTG GCAACGACTT GGACATCTGA 60 
GCTGTCACTG CCGAAAACAG GCCGCAAGAG AGATAATCAA TATGCATTTC CAAGCCTTTT 120 
GGCTATGTTT GGGTCTTCTG TTCATCTCAA TTAATGCAGA ATTTATGGAT GATGATGTTG 180 

AG ACGGAAGA CTTTGAAGAA AATTCAGAAG AAATTG ATGT TAATGAAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAG ACA CCTCAACCTA TAGGAGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGGAAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATGAC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AGATGGGAAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420 
GTGACAGAGG ACTGGTATTA AAATCTAGAG CAAAGCATCA TGCAATATCT GCTGTATTAG 480 
CAAAACCATT CATTTTTGCT GATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540 
ATGGTATTGA TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTGATTC 600 
TGGAAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACC AG AT AAATGTGG AG 660 
AAGATTATAA ACTTCATTTT ATCTTCAGAC ATAAACATCC CAAAACTGGA GTTTTCGAAG 720 
AGAAACATGC CAAACCTCCA GATGTAGACC TTAAAAAGTT CTTTACAGAC AGGAAGACTC 780 
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTGA GGTGTTAGTT GATCAAACAG 840 
TTGTAAACAA AGO AAGCCTC CTAGAGG ATG TGGTTCCTCC TATCAAACCT CCCAAAG AAA 900 
TTGAAGATCC CAATGATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960 
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020 
GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080 

CTG AAAAACC TGATG ACTGG AATG AAG ACA CGGATGG AG A ATGGG AGGCA CCTCAGATTC 1140 
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGGAAACC TCCCATGATA GATAACCCAA 1200 
AATACAAAGG AGTATGG AG A CCTCCACTGG TCG ATAATCC TAACTATCAG GG AATCTGG A 1260 
GTCCTCGAAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320 
CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380 

TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGCGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG G7GTATTAAA ACAGTTAATG GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGCCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620 
AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680 
AAGCAGCCCT GGAAAAACCA ATGGACCTGG AAGAGG AAAA AAAGCAAAAT GATGGTGAAA 1740 
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800 
TAG AAGG GC A AGAAGAAAGT AATCA ATCAA ATAAGTCTGG GTCAGAGGAT GAGATGAAAG 1860 
AAGCAGATGA GAGCACAGGA TCTGOAGATO GGCCGATAAA GTCAGTACGC AAAAGAAGAG 1920 
TACGAAAGGA CTGAACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980 
AAAAATCAGC ATGCC AO ACC TG AACTTTA A TC AGTCTGC A CATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100 
ACTTTG AAGT TACCTCATCT TTG AATTTAG AATAAAAGTG GCACATTACA TATCG GATCT 2160 
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGGAGATAG TTTTGGTTTG 2220 
TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGO AAAAATCAGT TATTOGAATT 2280 
TCCACTTAAA TGGCTATACA AC AATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340 
TTCTGTTGTG AAG AGCTAAA TGC AATAAAG TTTCTGTATG GTTGTTTG AT TCTATCAACA 2400 
ATTG AAAGTG TTGTATATGA CCCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460 
AGTTGTTTGC TTAAATTATA GATTCCTTTA AGGACATGCC TTGTTCATAA AATCACTGGA 2520 

TTATATTGCA GCATATTTTA CATTTGAATA CAAGGATAAT GGGTTTTATC AAAACAAAAT 2580 
OATGTACAGA TTTTTTTTCA AGTTnTATA GTTGCTTTAT GCCAGAGTGG TITACCCCAT 2640 
TCACAAAATT TCTTATGCAT ACATTGCTAT TGAAAATAAA ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 

SEQ ID NO:160 PFA1 Protein sequence: 

Protein Accession #: NP_004353.1 

1 11 21 31 41 51 
| | | | | | 

MHFQAFWLCL GLLFISINAE FMDDDVETED FEENSEEIDV NESELSSEIK YKTPQPIGEV 60 
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEIEEL KENQVPGDRG LVLKSRAKHH 120 
AISAVLAKPF IFADKPLIVQ YEVNFQDGID CGGAYIKLLA DTDDLILENF YDKTSYIIMF 180 
GPDKCGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFE 240 
VLVDQTVVNK GSLLEDVVPP IKPPKEIEDP NDKKPEEWDE RAKIPDPSAV KPEDWDESEP 300 
AQIEDSSVVK PAGWLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRIGCGEWKP 360 
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFIIC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWLIYLVTA 480 
GVPIALITSF CWPRKVKKKH KDTEYKKTDI CIPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540 
KQNDGEMLEK EEESEPEEKS EEEIEIIEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 
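
Each full sequence line in this listing carries six groups of ten residues followed by a running position count. The following is a minimal sketch, assuming that layout, of how one line could be reduced to its bare residues; the helper name `residues_from_line` is illustrative and not part of the listing.

```python
import re


def residues_from_line(line: str) -> str:
    """Return the bare residues from one line of a sequence listing."""
    # Remove a trailing running position count such as " 420", if present.
    body = re.sub(r"\s+\d+\s*$", "", line)
    # Remove the spaces that separate the ten-residue groups.
    return re.sub(r"\s+", "", body)


# One full line of SEQ ID NO:160 as printed above.
line = "PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420"
residues = residues_from_line(line)
print(len(residues))  # 60
```

A final, shorter line such as `SVRKRRVRKD` has no position count and is returned unchanged.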

SEQ ID NO:161 PEZ9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005932 

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT 60 
GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTCGAAGC CGGGATCCGG GCCCGAAGGG 180 
TCAGCACCAG CTGGTCTCCC GTGGGCGCCG CCTTCAATGT CAAGCCCCAG GGCAGCCGCT 240 
TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TGAGCTGAGT GCCCCAGAAG 300 
GATTTCATAT TGCACAAG AA AAAGCCTTGA GAAAG ACAG A ATTGCTTGTG G ACCGTGCAT 360 
GTrCCACCCC ACCTGGGCCC CAGACCGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAG AGTGGC CGACTTGGCT GATTTTGTGA AAATCGCTCA CCCTGAGCCA GCATTCAGAG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 
TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600 
ATCCAGAAAC AAGGCGAGTG GCTGAACTGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660 
ATCTAGACAA ACAAAAGCGT AAAAGAGCAG TGGACCTCAA TGTTAAAATC TTGGATTTGA 720 
GTAGTACATT TCTTATGGGA ACCAATTTTC CCAACAAG AT TGAGAAGCAT CTCTTACCAG 780 
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GGTCTCCACG 840 
CAGAATCACC AGATGACTTG GTGCGAGAAG CTGCTTATAA AATTTTTCTT TATCCCAATG 900 
CTGGTCAATT GAAATGTTTA G AAGAATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CACGTTTTCT CACAGGGCTC TCCAAGGAAC G ATAGCTAAA AATCCAGAGA 1020 
CTGTCATGCA GTTCCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTCCGAA GTAATGCCCT 1140 
GGGACCCCCC TTACTACAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGCXXAGCC 1200 
TATATTGCCC GTTTTTCTCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACTGTTGGG GATTTCATTA TATGCAQAGC AGCCTGCAAA AGGAGAGGTG TGGAGCGAAG 1320 
ATGTCCGAAA ACTGGCTGTT GTTCATG AAT CTGAAGGATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCGAGCAGAC AAACCACATC AGGATTGCCA TTTCACTATC CGTGGAGGCA 1440 
GACTAAAGGA AGATGGAGAC TATCAACTCC CACTTGTAGT TCTTATGCTG AATCTTCCCC 1500 
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTCCTGGCAT GATGGAAAAT CTTTTCCATG 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GACGTACTCG TTACCAACAC GTCACTGGGA 1620 
CCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT G ATGGAGTAC TTTGCAAATG 1680 
ATTATCGAGT AGTTAACCAA TTTGCCAGAC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740 
ATATGGTGTC TCGTCTTTGT GAATCTAAAA AGGTTTGTGC TGCAGCTGAT ATGCAACTTC 1800 
AGGTCI ITT A TGCCACTCTG GATCAAATCT ACCATGGGAA GCATCCCCTG AGGAATTCAA 1860 
CCACAGACAT TCTCA AGGAA ACACAAGAGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGGCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040 
ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACGGTGGA GGCAGGGAGC 2100 
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATG AC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGGACTTCG AAACTTTCCT CATGGATTCT GAATAAAAGA 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATGACT TTGTTATAAA TGCTACAGCT 2280 
OTGAGAGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 
TGGTAGAACT TGGAATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA 






SEQ ID NO:162 PEZ9 Protein sequence: 
Protein Accession #: NP_005923.1 

1 11 21 31 41 51 

MLCVGRLGGL GARAAALPPR RAGRGSLEAG IRARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFHIA QEKALRKTEL LVDRACSTPP GPQTVLIFDE LSDSLCRVAD 120 
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180 
RVAELFMFDF EISGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHII IDGLHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360 
YSGVIRAERY NIEPSLYCPF FSLGACMEGL NILLNRLLGI SLYAEQPAKG EVWSEDVRKL 420 
AVVHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLVVL MLNLPRSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRVV 540 
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSD LDLDFETFLM DSE 
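
The annotated start of a coding sequence can be checked by translating the DNA from that position and comparing the result with the paired protein entry. The following is a minimal sketch using the standard genetic code and the first two lines of SEQ ID NO:161 (coding sequence 75-2216); the names `CODON_TABLE` and `translate` are illustrative and not part of the listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First two lines of SEQ ID NO:161, without the position counts.
first_lines = [
    "GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT",
    "GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG",
]
dna = "".join("".join(line.split()) for line in first_lines)

cds_start = 75  # 1-based position quoted in the listing
first_codons = dna[cds_start - 1:cds_start - 1 + 30]
print(first_codons[:3])         # ATG
print(translate(first_codons))  # MLCVGRLGGL
```

The translated residues match the first ten residues of SEQ ID NO:162.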



SEQ ID NO:163 PEZ8 DNA SEQUENCE 

Nucleic Acid Accession #: AF103807 

Coding sequence: none (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTGGG AAGGACCTGA TGATACAGAG 120 
GAATTACAAC ACATATACTT AGTGTTTCAA TGAACACCAA GATAAATAAG TGAAGAGCTA 180 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGAOGGC ACTTTCTG AG 240 
TACTCAGTGC AGCAAAG AAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300 
GGCTGCTGAC TTTACCATCT G AGGCCACAC ATCTGCTGAA ATGGAGATAA TTAACATCAC 360 
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTGAC ATGTTTTTGC ACATTTCCAG 420 
CCCCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT CCCTGGGAGA 480 
AATGCCCGGC CGCCATCITG GGTCATCGAT GAGCCTCGCC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAG AAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG GATATTTATT TGAACGGGATTACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTACC AAT G AG AGGAAAA CAGACG AGAA AATCTTG ATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAACCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG TAATATCTGA TCTCTACGGT TCCTTCTGGG 900 
CCCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAGATCTGTA 960 
CTGTGACCTT TCTACACTGT AGAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCTAAT ATGTAGCTGA CTGTTTTTCC TAAGGAGTGT TCTGGCCCAG GGGATCTGTG 1080 
AACAGGCTGG GAAGCATCTC AAGATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATGA 1140 
TCATTACGG A GTGAATTATC TA ATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCCCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAACTT TTTTTTTTAA CCTGGA AG AA TTCAATGTTA CATGCAGCTA TGGG AATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC CCCTTTGTTT 1380 
GATTTTTTTT CCAGTATAAA GTTAAAATGC TTAGCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAAC AAAATC TAACTTGTAA TTCCTTGAAC ATGTCAGG AC ATACATTATT CCTTCTGCCT 1560 
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAGAAG GGACACATAT GAGATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAGAGTTTA GATAAATATA TGAAATGCAA GAGCCACAGA 1740 
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAG AATTC ATGCAGTGCA AATCCCCAAA GGTAACCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTGAAAG AAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAGAAT TTACAAAGAG CTACTCAGGA CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTGTGTGTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CITGACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TGAGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTC ATTATTC TCCAGTAAAT GTO ATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280 
TTC AC AAAAG CAGCTGG AAA TGG AC AACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAGA AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAG GTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTG AATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCCCTCTTTG 2580 
TGTTCATGGA TAGTCCAATA AATAATGTTA TCTTTGAACT GATGCTCATA GGAGAGAATA 2640 
TAAG AACTCT GAGTG ATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GG AACCA AG A TACAAAGAAC TCTG AGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGGAC CCAACGCATG TCTGAGATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGG ATGC TAGCTTCTGG CCATCTCTGG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACGACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAGAATTT CAACG ACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC C TGAA TGCCT AG ACCCTTAT TTTTATTAAT 3060 
TTCCAATAGA TGCTGCCTAT GGGCTATATT GCTTTAGATG AACATTAGAT ATTTAAAGCT 3120 
CAAGAGGTTC AAAATCCAAC TCATTATCTT CTCTTTCTTT CACCTCCCTG CTCCTCTCCC 3180 
TATATTACTG ATTGCACTGA AC AGCATGGT CCCCAATGTA GCC ATGC AAA TGAG AAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAGA CTGCTGAAGC CAGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGGACC ACTCCTGGGC CTTCGTGATT GTCAGGAGCA AGACCTGAGA 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATGAAGATCC ATAGAATTTG 3420 
CTACATTTGA GAATTCCAAT TAGG AACTCA CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
ACTTGCTGAA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTGATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 
AAAGTGGCTT TTATTCTCTT TATTATTATT ATTTTCTTTT ACTACTATAT TACGTTGTTA 3660 
TTATnTGTT CTCTAT ACTA TCAATTTATT TGATTTAGTT TCAATTTATT TTTATTGCTO 3720 
ACTTTTAAAA TAAGTOATTC GGGOGGTGGG AGAACAGGGG AGGGAGAGCA TTAGGACAAA 3780 
TACCTAATGC ATGTGGGACT TAAAACCTAG ATGATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATGTATC CCAGAACGTA 3900 
AAGTAAAATT TAAAAAAAAG TGA 



PEZ8 Protein sequence: 
Protein Accession #: none 






SEQ ID NO:164 PEZ6 DNA SEQUENCE 

Nucleic Acid Accession #: AB028945 

Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ATGATGATGA ACGTCCCCGG CGGAGGAGCG GCCGCGGTGA TGATGACGGG CTACAATAAT 60 
GGTOGCTGTC CCCGGAATTC TCTCTACAGT GACTGCATTA TTGAGGAGAA GACGGTGGTC 120 
CTGCAGAAAA AAGACAATGA GGGCTTTGGA TTCGTGCTTC GAGGGGCCAA AGCTGACACA 180 
CCCATTGAAG AATTCACACC AACAOCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGGACCG GGGACTTCTT GATTGAGGTT 300 
AACAATGAGA ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT CCGGCAGGGA 360 
GGGAATCACC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGACGACACC 420 
GCCAGGAAGA AAGCTCCCCC GCCTCCAAAG CGGGCACCX3 A CCACAGCCCT CACCCTGCGC 480 
TCCAAGTCCA TGACCTCGGA GCTGGAGGAG CTCGTGGATA AAGATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT G AGAACATGG CTGTGGAACC GAGGGTGGCG 600 
ACCATCAAGC AGCXKJCCCAG CAGGCGGTGC TTCCCGGCGG GCTCAGACAT GAACTCTOTG 660 
TACGAACGCC AAGGAATCGC CGTGATGACG CCCACTGTTC CTGGGAGCCC AAAAGCCCCG 720 
TTTCTGGGCA TCCCTCGAGG TACGATGCGA AGGCAGAAAT CAATAGACAG CAGAATCTTT 780 
CTATCAGGAA TAACAGAGGA AGAGCGGCAG TTTCTGGCTC CTCCAATGCT GAAGTTCACC 840 
AG AAGCCTGT CCATGCCGG A CACCTCTG AG GAC ATCCCCC CTCCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTACGGGA CGATTAAGCC TGCGTTCAAT CAGAATTCTG CCGCCAAGGT GTCCCCCGCC 1020 
ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGACCGCT ACTCCTTGGA CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140 
AACTTCCGCA ACAAGAGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTCCAACG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATOOC GACCATCATC 1320 
GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGCC AGGGCAGCAG CATGGAGATC 1380 
GACCCCCAGG CCCCGGAGCC ACCGAGCCAG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCCCTTTG CCGCCGCCAT GGCCGGAGCC GTCCGCGACC GTGAGAAGOG GCTGGAAGCC 1500 
AGG AGGAACT CCCCGGCCTT CCICTCCACA GACCTGGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTCCATG TTCCCCGAGG AGGGGGATTT TGCTGACGAG 1620 
GACAGCGCTG AGCAGCTGTC ATCCOCCATG CCGAGTGCCA CGCCCAGGGA GCCCGAAAAC 1680 
CATTTCGTGG GTGGCGCCGA GGCCAGTGCT COGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG CCCAGGGGCC OGAGAGCAGC CCAGCAGTGC OCTOCGOGAG CAGCGGCACA 1800 
GCCGGCCCCG GG AATTATGT CCACCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGCCCTGG CACTCTCCGC AAGGGACCGA GCCATGAAGG AGTCTCAACA GGGACCCAAA 1920 
GGGGAGGCCC CCAAGGCCGA CCTCAACAAA CCTCTTTACA TTGATACCAA AATGCGGCCC 1980 
AGCCTGGATG CCGGCTTCCC TACGGTCAGC AGGCAGAACA CCCGGGGAOC CCTGAGGCGG 2040 
GAGGAGACGG AGAACAAGTA CGAGACCGAC CTGGGCCGAG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGACGCCAC TAAGCTGGAC AACGCCCTGC AGGAAGAGGA CGAGAAGGCA 2220 
GAGGTGGAGA TGAAGGCAGA CAGCTCCCCG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280 
G AAGGTGCTT TACAGATCIC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGG A AGAGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATM TT ACAG AGCCATTGCC TCCTCCCCTG 2460 
GAATTTGCAA ATAGTTTTGA TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 
GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580
CCAACCAACT CTGCAGACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTGAAAG CTTTGACGCC GTCGCCGACT CTGGGATCGA GGAGGTGGAC 2700 
AGCCGGAGTA GCAGCGACCA CCACCTCGAG ACGACCAGCA CTATCTCCAC CGTGTCTAGC 2760 
ATCTCCACCC TGTCTTCCGA AGGTGGAGAO AATGTGGACA CCTGCACAGT CTATGCAGAT 2820
GGGCAAGCAT TTATGGTTGA CAAACCCCCA GTACCTCCTA AGCCAAAAAT GAAGCCCATC 2880
ATTCACAAAA GCAATGCACT TTATCAAGAC GCGCTCGTGG AAGAAGATGT AGATAGCTTT 2940
GTTATCCCCC CGCCCGCTCC CCCGCCCCCG CCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGACCTC CAAGTTGTGG GGCGACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 
COAGAGAAAT TGGCAAAGCC GGGGGAAGGA CTGGATTCAC CAATGGGAGC CAAGTCCGCC 3180
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATGAAA GCAGGACCTC AGGAACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTGTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 
GCTCTCTCAG ATGTCTTTAG CCTTCCAAGC CAGCCCCCTT CTGGGGATCT ATTTGGCTTG 3480
AACCCAGCGG GACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAGATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTGA ACATAAAGAG GCCTTCATGG ACAATGAGAT CGATGGCAGT 3660
CACTTACCAA ACCTGCAGAA GGAGOACCTC ATCOATCTTO GGGTAACTCG AGTCGGGCAC 3720 
AGAATCAACA TAGAAAGGGC TTTOAAACAG CTGCTGGACA QATAAGQACG GCTGCTCTCC 3780
ACCTCGCAGA CTGCTCTTGT TATAAGTAGA GATGGGCTCG TGCTGAAACA TCTGAATGCC 3840
AAGCGAAGTC TGTGAGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAGAAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAOACCC CTGGCTCACC ATGTGGGTGT 3960 
CTTGGGCAGT TTCTATGACA CATGGGACAA GGGGAGGGAG TTTTTCTAAC ATGGAAAAAO 4020 
ATTCCCAGCC TGCCGCCCAG CATGCAGGTG GCCTCGCTTT GCCGGOTCCG AGAGGCTGCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
GCTGAGACCT OCQTCCTCTG CTTTCCGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200 
GCCnrCCTC AGTCCTGTGG CCTCTCAGAG GACACCTGAT GCTCACCTGC CCCTCTTTCT 4260 
CCTGCACTTG GCTTGCAGTG AGATGCTCCC AGATGCATTT GTCCAGTGCC CCATCATGGG 4320 
CCTGAAAGGC AG AGAAACTT TTTCCTACAC AGATTCTTTT CCCXATCTCC TCCTGTGGTT 4380 
TGCATCCATG GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATCG 4440 
TGCCCAGCTT TGCTTAGCTT TCTTTATTTC TGCAAATCTG TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACOATTCCAC AATGGAGGGG 4560 
AGACCTGGCC AAGGGAGCCA GCC AGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATG AG CACCCATCCA GGATGGAGAA TAAGGGCTTC TCTGCCTCTC 4680 
AGAATTCTTT TTAATTGAAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT GAAGG ACAAG AAGACGCATG GCTCATGGCG GGCACATGOG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGG A GCTGGGCGGG CACAGCATCC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTG AGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTCGGA GGCCAGGG AA GATGGTACTT 4920 
AGAGGCTTTT COCCTATCGC TCTGGGTGTC TAGGAATCCC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGACC CAGTGGGTAT GGAGTATAGA CAGAACCCAG GGTTGAGAAC 5040 
AGAAGGTGGG CGGCAGGATC AGAGTGAAAG CAGAGGCGTG AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGCCAGGTCA GCCTCTCTGG CAAGGCTTTC TTGAGCCCCG CXXXriTTCTT 5160 
TCCCCGGAGT CCCTCCACCC CATAACAATA CCTCG AATTT CCAAAAG AGG TCACCAG ATG 5220 
CACATGGGCC GCAAAACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280 
CTCGAATGTC AGG TI ' ITIG O TTTTATTATT ATTTCAGA AC TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT GGTTTTGTTT TTTTTTTTTC CTTTTTTTCT TGATTAGGTC TGGAACAGCT 5400 
CTAGAATGAA CACATAAAAT TTAGCAATTT AAAATCTTTC TTTACTGCAA GTTTAAATAG 5460 
TTGTACAGAT AGTTTATAAG C ACAATATTT TAAG AAAAAA AAGTGGCTGG TCTACTAGGC 5520 
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAG AAAAA AAAACTTTTG TGATTTAATA 5580 
ATACTATTTC TGTGGAATAA TTATAAAAGT ATGACCTTTT TAAATCAACC TTATTTGGAT 5640 
GCATCTGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT CGTCATTGAA 5700 
GGACAATTTC TTCAAAGTGG TTACAATTCA TAATGCAGCA GTTTCTCCAA AAACAAAAAC 5760 
AAAACACACA CCACACACAC GCGCTTTTCC AGTCACACAC CCCTGATGTT GGAACCAAGT 5820 
rnTGGACCT TCTGTTCCAA AACCTTTTGC AGGTCAATCT TTGTATTTGA AAT GATCCA A 5880 
TCCAACTTGA AGTCA ATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATGAGAT GAATGAGCAT TACTCTAGAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCTG AGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCC A TTAATTTGTC 6060 
TTCCATTTGA TGATACCGCA AAATTCCGTO AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120 
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAACC AGTCGAAACT CGTGACTTCG GTTTGTTGAA 6240 
CTTTGGCAGC CAGTTGGTO A GGGCCAGATG TTATTCCCTT TCTTAAAGAT ACTCCAAGCC 6300 
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA G AGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA GAG AAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCT GCTT ATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATCACCACC CACTGTTCTC CCCCTTTGGG ACATGTTAGG 6540 
ACGAGGCCCT ATTCCATGCC CCTCTTTAAT GGTGGAACAA ATGTTAAACT GCTCATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTGT ATAGAAAACC CAAAT CTCTC 6720 
AAAATGTAAA TTATGTATAC CTGCCAAGAT ACCTTTTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TGATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTCAGT 6900 
TATTGAACAA GCAAGCATTA TCCAGTTGAT CTGGCAATGA CTTTTTGTGT GTGGGCCACA 6960 
ATATTGATTT TCCCATTAAC AATTl'nTl '1' TGTTTTTTAA ATACTAA TAT GTTTCACACT 7020 
ATAGTTTGTG TAACAACACG TGTTCGCATT ATCTATGTTG CTGTTACTTT TGTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTG ATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTG ACCG CTGCTATAGG CGTGGAGCTG AGGCTCGGCT 7500 
TTTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560 
TCCCTTAGGG TCCAGTCTCC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
GAGTGGCAGA ACTGGGCCGC CTCTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGO AG 7680 
CAAG AGAGAA TTTGTGTCTA TTGGCAAAGA ACTAAGCCAG GAAGACATGG GCCATCCCTC 7740 
CGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTG AACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAACTTGTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTG AGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGG A AA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
CCCTCAAGCr CTCCCGCTTC ACCATCCAAT AGTTTCTCCC AAACCTTGGC ACCOCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTCCAG ACCA CTTTTCCTAG ATGAATATAT TCGTT TACCT 8040 
TACTAGGAAA ATTATTGOAA GATTTTTTCT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100 
TGGCGAACTG GAATGTGTTT CTGTATTTGT AGACAACCAT GTACCCATGC AAGTAG GTGA 8160 
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220 
GTGGGGAATC AG AGAATTTC CAAACTTGTT TCTCAGACTT CCGCAGATCT CATCACTTTG 8280 
ATTTCTAATC C ATGCTGTAT TGGTGATTTT GTTTATCGTT CCTGTAACTT GTTCTAC ATT 8340 
CCACAGTCTT TACCGTTTTA TGTTCAAAAT TACAACAATC CCTGTCCATT GATTCCACTC 8400 
TGG AACTCTT TGTTCATGCC AATTTTG AAA TTTTAATACG AGCCTTCAAA TAAACACAGA 8460 
AAAOAAAAAA AAAAAAAAAA AAAAAAAA

SEQ ID NO:165 PEZ6 Protein sequence

Protein Accession #: BAA82974.1

1 11 21 31 41 51
| | | | | |
MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCIIEEKTW LQKKDNEGFG FVLRGAKADT 60
PEEFTPTPA FPALQYLESV DEGGVAWQAG LRTGDFUEV NNENWKVGH RQWNMIRQG 120
GNHLVLKVVT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEE3 180
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNSV YERQGIAVMT PTVPGSPKAP 240
FLGIPRGTMR RQKSIDSRIF LSGITEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300
PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGKI ASKAVYVPAK PARRKGMLVK 420
QSNVEDSPEK TCSIPIPTII VKEPSTSSSG KSSQGSSMEI DPQAPEPPSQ LRPDESLTVS 480
SPFAAALAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540
DSAEQLSSPM PSATPREPEN HFVGGAEASA PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600
AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMLIDIMDT SQQKSAGLLM 720
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRTI 780
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGIEEVD 900
SRSSSDHHLE TTSTCSTVSS ISTLSSEGGE NVDTCTVYAD GQAEMVDKPP VPPKPKMKPI 960
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEIKSPI 1020
LSGPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPEIM STISGTRSTT 1080
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPVVSPT EMNKETLPAP LSAATASPSP 1140
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200
ESLNLGEHKE AFMDNEIDGS HLPNLQKEDL IDLGVTRVGH RMNIERALKQ LLDR

SEQ ID NO:166 PEZ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000024
Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
ACTGCGAAGC GGCTTCTTCA GAGCACGGGC TGGAACTGGC AGGCACCGCG AGCCCCTAGC 60

ACCCGACAAG CTGAGTGTGC AGGACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTOCG CCCGCTGAGG 180
CGCCCCCAGC CAGTGCOCTT ACCTGCCAGA CTGCGCGCCA TGGGGCAACC CGGGAACGGC 240
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCCGG ACCACGACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCGT CCTGGCCATC 360 

GTGTTTGGCA ATGTGCTGGT CATCACAGCC ATTGCCAAGT TCGAGCGTCT GCAGACGGTC 420
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGTGGTG 480
CCCTTTGGGG CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540
TTTTGGACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAGAGCCT GCTGACCAAG 660 

AATAAGGCCC GGGTGATCAT TCTGATGGTG TGGATTGTGT CAGGCCTTAC CTCCTTCTTG 720
CCCATTCAGA TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGACCTGCT GTGACTTCTT CACGAACCAA GCCTATGCCA TTGCCTCTTC CATCGTGTCC 840 
TTCTACGTTC CCCTGGTGAT CATGGTCTTC GTCTACTCCA GGGTCTTTCA GGAGGGCAAA 900
AGGCAGCTCC AGAAGATTGA CAAATCTGAG GGCCGCTTCC ATGTCCAGAA CCTTAGCCAG 960 

GTGGAGCAGG ATGGGCGGAC GGGGCATGGA CTCCGCAGAT CTTCCAAGTT CTGCTTGAAG 1020
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCACCCT CTGCTGGCTG 1080
CCCTTCTTCA TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCG TAAGGAAGTT 1140
TACATCCTCC TAAATTGGAT AGGCTATGTC AATTCTGGTT TCAATCCCCT TATCTACTGC 1200
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTTCTTTG 1260

AAGGCCTATG GGAATGGCTA CTCCAGCAAC GGCAACACAG GGGAGCAGAG TGGATATCAC 1320
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380
GTGGGCCATC AAGGTACTGT GCCTAGCOAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440
ACAAATGACT CACTGCTGTA AAGCAGTTTT TCTACTTTTA AAGACCCCCC CCCCCCCAAC 1500

AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560
TGTATAGAGA TATGCAGAAG OAAGGGCATC CTTCTGCCTT TTTTATTTTT TTAAGCTGTA 1620
AAAAGAGAGA AAACTTATTT GAGTOATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTAAAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTACCTC ACTATTCAAO TATTAGGGGT AATATATTGC 1800
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA OCCTTGGACT TGAGGATTTT 1860 

GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920
ACACGGGGTA TTTTAGGCAG GGATTTGAGO AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 

SEQ ID NO:167 PE24 Protein sequence:
Protein Accession #: NP_000015.1

1 11 21 31 41 51
| | | | | |
MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWVVGMGIVM SLIVLAIVFG NVLVITAIAK 60
FERLQTVTNY FITSLACADL VMGLAVVPFG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QKIDKSEGRF 240
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHVIQD 300
NLIRKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL



SEQ ID NO:168 PE21 DNA SEQUENCE

Nucleic Acid Accession #: NM_004457
Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
GAATTCGTTG TTGGGAAGGA CTGGGGAAAC AGCTGTAACA TTTGCCACCC TCAGAAGCTG 60
CTGGTCCTGT GTCACACCAC CTTAGCCTCT TGATCGAGGA AGATTCTCGC TGAAGTCTGT 120
TAATTCTACT TTTTGAGTAC TTATGAATAA CCACGTGTCT TCAAAACCAT CTACCATGAA 180
GCTAAAACAT ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATTCTA ACATACATTC CGTTTTATTT TTTCTCCGAG TCAAGACAAG AAAAATCAAA 300

CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTGATTCT GCATACAGAT CTGTTAATAG 360
TTTGGATGGT TTGGCTTCAG TATTATACCC TGGATGTGAT ACTTTAGATA AAGTTTTTAC 420
ATATGCAAAA AACAAATTTA AGAACAAAAG ACTCTTGGGA ACACGTGAAG TTTTAAATGA 480 
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATAA 540
TTGGCTTTCC TATGAAGATG TCTTTGTTCG AGCCTTTAAT TTTGGAAATG GATTACAGAT 600

GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GCCATTGTTC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840

CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900
GCATACCATG GCTGCAGTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TCCAAAGGGA GTCATGATCT CACATAGTAA CATTATTGCT GGTATAACTG GGATGGCAGA 1080 
AAGGATTCCA GAACTAGGAG AGGAAGATGT CTACATTGGA TATTTGCCTC TGGCOCATGT 1140

TCTAOAATTA AGTGCTGAGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTG AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380
TAATTACAAA ATGGAACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440 
TTTCCGGAAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500
TCCACTTTCT GCAACCACGC AGCGATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560

GGGATACGGG CTCACTGAAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGGACTACAA 1620 
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGGATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740 
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800

TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920 
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTCCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040 

TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTOA 2100
AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160
GTTTGAAATT CCAGTAAAAA TTCGTTTGAG TCCTGAACCG TGGACCCCTG AAACTGGTCT 2220
GGTGACAGAT GCCTTCAAGC TGAAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280
TGAGCGAATG TATGOAAGAA AATAATTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400 

CTCATATTAA ACTATTACTT CTCATGACGT CACCATTTTT AACTGACAGG ATTAGTAAAA 2460
CATTAAGACA GCAAACTTGT GTCTGTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2520
TACCACCTAT GACTGTACTT GTCAGTATGA GAATTTTTCT GAATCATATT GGGGAAGCAG 2580
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTOTATAAT GTTCAGTTTG 2640 

TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700
TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAACGGT ATTGATATGA AATTATGTAA 2760
ATTTCAAATG CTTATGAATC AAATCATTGT TGAACAAAAG ATTTGTTGCT GTGTAATTAT 2820
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT TATGTTTTAA GAAGTTGAGA 2880
TCTTGTG AAT ATATGCCTGT CAGTGTCTTC TTTATATATT TAU1UTAT TAG AAAAAAT 2940 

GAAGTTTGGT TGGTGATGCA TGAAACAAAA TAGCAAGAGA GGGTTATAGT TTAATAGTAA 3000
GGGAGATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180
GAATATTTCG TTGACTATAT GTACATTGAG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAGATA ATCATTATTT CATTTTAAAA 3300
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGGATG TGTGTGAGTC AGTGGTAGCT 3360
TATTTAAAAA GCACCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATGAAAGAAT 3420
TTAGAATGTA TTTGATGATA GCATTCTCAC TAAGACACAT GAOAATTTAA CTTTATAACC 3480
GCGTGAGTTA AGATTTAATT CATAGGTTTT GATGTCATTG TTGAAGTTAT TTGTAATTCA 3540 
OAAACCTTGC TTGTGTGATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600 
ATATCTGOAT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGGACTTAA 3660
ATCATAGGCA CCACATTTTT CATGTCAGAC TAGTTACTTT GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 



SEQ ID NO:169 PP1 Protein sequence:
Protein Accession #: NP_004448.1

1 11 21 31 41 51
| | | | | |

MNNHVSSKPS TMKLKHTINP ILLYFIHFLI SLYTILTYIP FYFFSESRQE KSNRIKAKPV 60
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTREV LNEEDEVQPN 120
GKIFKKVILG QYNWLSYEDV FVRAFNFGNG LQMLGQKPKT NIAIFCETRA EWMIAAQACF 180
MYNFQLVTLY ATLGGPAIVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHIITVDGK 240
PPTWSDFPKG IIVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVMIS 300
HSNIIAGITG MAERIPELGE EDVYIGYLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360
SSKIKKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMNICFCCP VGQGYGLTES 480
AGAGTISEVW DYNTGRVGAP LVCCEIKLKN WEEGGYFNTD KPHPRGEILI GGQSVTMGYY 540
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKIIDRK KDLVKLQAGE YVSLGKVEAA 600
LKNLPLVDNI CAYANSYHSY VIGFVVPNQK ELTELARKKG LKGTWEELCN SCEMENEVLK 660
VLSEAAISAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADIERMYGRK



SEQ ID NO:170 PCQ7 DNA SEQUENCE
Nucleic Acid Accession #: none found
Coding sequence: 38-1075 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51
| | | | | |
AGCAACGACG CCGGGCAGCG .......... CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60
CCTGCTGCTG AGCAGCGCCG .......... GCTGCTCCCC GGGAACAACT TCACCAATGA 120
GTGCAACATA CCAGGCAACT .......... CAATGGACGG TGCATCCCGG GCGCCTGGCA 180
6TGTGACGGG CTGCCTGACT .......... GAGTGATGAG AAGGAGTGCC CCAAGGCTAA 240
GTCGAAATGT GGCCCAACCT .......... TGCCAGCGGC ATCCATTGCA TCATTGGTCG 300
CTTCCGGTGC AATGGGTTTG .......... CGATGGCAGC GATGAAGAGA ACTGCACAGC 360
AAACCCTCTG CTTTGCTCCA .......... CCACTGCAAG AACGGCCTCT GTATTGACAA 420
GAGCTTCATC TGCGATGGAC .......... TCAAGACAAC AGTGATGAGG AAAGCTGTGA 480
AAGTTCTCAA GAACCCGGCA .......... GTTTGTGACT TCAGAGAACC AACTTGTGTA 540
TTACCCCAGC ATCACCTATG .......... CAGCTCCGTC ATTTTTGTGC TGGTGGTGGC 600
CCTGCTGGCA CTGGTCTTGC .......... GAAGCGGAAC AACCTCATGA OGCTGCCCGT 660
GCACCGGCTG CAGCACCCTG .......... CCGCCTGGTG GTCCTGGACC ACCCCCACCA 720
CTGCAACGTC ACCTACAACG .......... CATCCAGTAT GTGGCCAGCC AGGCGGAGCA 780
GAATGCGTCO GAAGTAGGCT .......... CTACTCCGAG GCCTTGCTGG ACCAGAGGCC 840
TGCGTGGTAT GACCTTCCTC .......... CTCTTCTGAC ACGGAATCTC TGAACCAAGC 900
CGACCTGCCC CCCTACCGCT .......... GAGTGCCAAC AGTGCCAGCT CCCAGGCAGC 960
CAGCAGCCTC CTGAGCGTGG AAGACACCAG CCACAGCCCG GGGCAGCCTG GCCCCCAGGA 1020
GGGCACTGCT GAGCCCAGGG ACTCTGAGCC CAGCCAGGGC ACTGAAGAAG TATAAGTCCC 1080
AGTTATTCCA AAGTCCATAT GGGTTAATCT GCTCTGACTT GTTGCCATTC TAACAATTTG 1140
TGCTCATGGG AAGCTCTTTA AGCACCTGTA AGGATGTCTC AAGTTACAGT TTGGGATATT 1200
AACTATCTCT GCATTCCCCT CCTCCCCCAG ACTTCAGAGA TGTTTTTCTG GCGTCTCAGT 1260
TGACATGATC TGTTGTGCGT CTTTTCTGTC AGGTCACTCT TCCCTTGGGA CCCGAGATCA 1320
CACCCTCATT TTTCACATTA TTCTGTTTCT GTTGGAGAGA CAGCATATAA AACAGTATTG 1380
AAATAGGCTG GGAGAGAGCA ATGTTTCTGT GCTATATTGG ATGCTCAGAA GTGCAGGAGA 1440
CGCTGGACCC AATTCTCTCT GCTGGGTAGT TACCTTATAG CATTTGGGGA TTTGGGTTAG 1500
ATGATCTAAC CAGGAGGCCA TCACTGGATG GTCACCCCCC CAAAAAAATT CCATTTGAGC 1560
ATCAAAACCT GCTTTGCACA ATCCTATTTG ATGCCCCCAG TTCAGCAGAG TCAGTGGCCA 1620
AAGAAAACTT TGGACGTGAG TAACACCCTT CAGCAGTCGC AACGTTATTT TGGTTTTGTG 1680
AAGGACTCTG AAACCATCTA CCCTGTATAA ATTCTGGCTT TAGAAATTTG CCCAAGAATG 1740
CTCATTCTGA GAGCTTTCCT CAGCAGCATA TATCATCAGC CTCATCCTAA AATAGGCAGG 1800
GAGCCCCTCC CATGAGTTTA TCCAAGTTCT CAGCTCCTAA AATGCAGGCT GCCAAGACCC 1860
TACACCTGCC CTGGCTCTAC AGCCACTTAC CTGGTTTCTG GACTGTCACC CTCCCAGCTG 1920
ACCTGCCCQT AGCCAAGGAA TGAGGACCTA ACTTGAGTTG GCCCAAAGTC TGACCTGGCT 1980
GTATGTCCCT GTGGCCCACA CCCAGCCTGT CTTGCTCATT CATGCAGCCT CAACACTGGC 2040
CTCCAAAGTT CCCTTAACAC TTGCAAAGTC CTTTTTACCT GTGCATTTGG ACTTGAGGAC 2100
ACTGGTTTCT ATCACAGGTG AGAGCCATGT TCAATACCTC CAGCAAGCTC .......... 2160
CTGCACTGTG CACGCTCCTC TTCCCAAGGT CCCAATACCA GCACCTCTAG TTAGAGTTAG 2220
GGTCAGGGTC AGGCCTCTCC CAACATCCCA GTAGTTTCTC CTCTGAGACA CATGGGCAAG 2280
AGACAATTTG GAGTCAAGAT TTTCCATTTG GATCTATTTT AAATCTTTTA GAAATGCATT 2340
TGAAACAGTG TGTTTGTTTT TTCCCTTCTA gttaagggac TATTTATATG TGTATAGGAA 2400
AGCTGTCTCT TTTTTTGTTT TTCCTTTAAC aaggtccaaa GAAAGATGCA AAAGGAGATC 2460
ACACCCTTGC CCCGCTGAGC CCCGTGATAA caagtcactc CAGACTAACC TGTGTGCCAG 2520
ACATTTGTGC ATTGTTGCAC TTTGAGGTTA TTATTTATCA AGTTCTTGAA GGAAGCAGAA 2580
AGAGGGACTC CTCTCTCCCT CCGTGTATAG TCTCTATGTT TGTGCTAGTT TTTCTTTTTT 2640
TTCTCTGTGT CCAGTCAGCC ACAGGGCCCG CCTCCCTGCA GGAATAAGGG GTAAAACGTT 2700
AGGTGTTGTT TGGCAAGAAA CCACACTGAC TGATGAGGGG TAAAATGGAA CCAGGTAGAG 2760
CCACTCCGGG CAGCTGTCAC CCATTCAGAA CTTCTTTCCG CAGCTGAAGA AATGTTCAGT 2820
AACCTGTTTG ACGCTAATTA AAACAGAGCC TGCAGGAAGT GGGGCTAAAG TGGCATTCAQ 2880
TGATCCTGTT CTGTAGACTT TTCTTTCTTT TTTTAACCAA ATCCAAAGGA TGTTACAGAA 2940
AA6CTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000

AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAQTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCTGAA 3120 

CATTTCATCT CCTGTGAGTC A6AAGG6CTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TTTCTGGTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240 

GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300 

AGATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TGATTTTTTT AATGAATGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTGGGGG GAGGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540 

AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTG TGATGACTGG CCTATTACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720 

GAAAGGTTGT GTGTCGTTGC TTTTTGTGTT TTGGTTAGGC TTGGTTTTGT TTTTTAATTT 3780 

TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMARM 3840 

AAMMAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

TGGGGCGGCG GGGCCCACGT AGGTACGGCG ACCACGCGGG CCCAAACGGG ACCCCAGAAG 3960 

GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020 

GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAGAGAG AACTCAGAAG CACACAAGCG 4080 
GGACTCAACC AGGAGGACCC AAGGGAACCC GATAGAGTAC G 



SEQ ID NO:171 PCQ7 Protein sequence

Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MWLLGPLCLL LSSAAESQLL FGNNFTNECN IPGNFMCSNG RCIPGAWQCD GLPDCFDKSD 60
EKECPKAKSK CGPTFFPCAS GIHCIIGRFR CKGFEDCPDG SDEENCTANP LLCSTARYHC 120
KNGLCIDKSF ICDGQNNCQD NSDEESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180
VIFVLVVALL ALVLHHQRKR NNLMTLPVHR LQHPVLLSRL VVLDHPHHCN VTYNVNNGIQ 240
YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLKQADL PPYRSRSGSA 300
NSASSQAASS LLSVEDTSHS PGQPGPQEGT AEPRDSEPSQ GTEEV
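
The coding-sequence coordinates printed with each DNA entry can be cross-checked against the length of the paired protein entry. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the stated range includes the stop codon (as the "start and stop codons" note suggests); the function name is hypothetical and is not part of the listing.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residue count implied by 1-based, inclusive CDS coordinates.

    Assumes the stated range ends on the last base of the stop codon.
    """
    span = cds_end - cds_start + 1
    if span % 3 != 0:
        raise ValueError("coding span is not a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


# Coordinates as printed for SEQ ID NO:170 (paired with SEQ ID NO:171).
print(expected_protein_length(38, 1075))
```

Under these assumptions the range 38-1075 implies 345 residues, which agrees with the SEQ ID NO:171 listing (five full rows of 60 residues plus a final row of 45).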



SEQ ID NO:172 PELS DNA SEQUENCE
Nucleic Acid Accession #: NM_005658.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACQAG GTGCATCCGG 180 

CTCAGTACTA CCCGTCCCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATCCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540 

CATCTCAGAG GAAGTCCTGG CACCCTOTGT GCCAAGAOGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGCCTGCGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840 

CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGACAATCT TTCATGTTCT 1020 

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080 

AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620 

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680 

TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTQT 1740 

CCGCAAGGGG TGATGGCCGG CTGGTTGTGG GCACTGGOGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GCCCCCATTG AGATCTTCCT GCTGAGTGCT TTCCAGGGGC CAATTTTGGA 1860

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATGAAAAA GGAGAGACAT 1920 

GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040 

GATGGTGGCC AGAAATAAAG GGACCAGCCC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG CCCTTGGTGC 2160 



GAGGGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220 

CATTGGGTGG GGCTCCTGGG AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA GTTTGGCACC 2340 

ATGTCGGCCT CTTCAGGCCT GATAGTCATT GGAAATTGAG GTCCATGGGG GAAATCAAGG 2400

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460

CTGAGTTCAA AGCCATCTT 

SEQ ID NO:173 PEU Protein sequence:

Protein Accession #: NP_005647.1

1 11 21 31 41 51

| | | | | |

MALNSGSPPA IGPYYENHGY QPENPYPAQP TVVPTVYEVH PAQYYPSPVP QYAPRVLTQA 60
SNPVVCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFMG SKCSNSGIEC 120
DSSGTCINPS NWCDGVSHCP GGEDENRCVR LYGPNFILQM YSSQRKSWHP VCQDDWNENY 180
GRAACRDMGY KNNFYSSQGI VDDSGSTSFM KLNTSAGNVD IYKKLYHSDA CSSKAVVSLR 240
CLACGVNLNS SRQSRIVGGE SALPGAWPWQ VSLHVQNVHV CGGSIITPEW IVTAAHCVEK 300
PLNNPWHWTA FAGILRQSFM FYGAGYQVQK VISHPNYDSK TKNNDIALMK LQKPLTFNDL 360
VKPVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC NSRYVYDNLI 420
TPAMICAGFL QGNVDSCQGD SGGPLVTSNN NIWWLIGDTS WGSGCAKAYR PGVYGNVMVF 480
TDWIYRQMKA NG
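
The relationship between a DNA entry, its stated coding-sequence start, and the paired protein entry can be illustrated with the first two rows of SEQ ID NO:172. The sketch below is illustrative only: the helper names are hypothetical, the coordinates are assumed to be 1-based, and the standard genetic code is assumed. It strips the spacing and position counts from the printed rows, reads from position 57, and translates whole codons.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def parse_listing(text: str) -> str:
    """Join listing rows into one string, dropping spaces and position counts."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def translate(nt: str) -> str:
    """Translate whole codons only; codons not in the table become 'X'."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


# First two rows of SEQ ID NO:172 as printed in the listing.
ROWS = """
GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60
CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120
"""

CDS_START = 57  # 1-based, from "Coding sequence: 57-1535"

seq = parse_listing(ROWS)
start_codon = seq[CDS_START - 1:CDS_START + 2]
peptide = translate(seq[CDS_START - 1:])
print(start_codon, peptide)
```

Under these assumptions the start codon at positions 57-59 reads ATG, and the translated prefix corresponds to the opening residues of SEQ ID NO:173 (MALNSGSPPA IGPYYENHGY Q).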


SEQ ID NO:174 PBJ4 DNA SEQUENCE 

Nucleic Acid Accession #: AI694767

Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CAGAGAGGCT GTATTTCAGT GCAGCCTGCC AGACCTCTTC TGGAGGAAGA CTGGACAAAG 60 

GGGGTCACAC ATTCCTTCCA TACGGTTGAG CCTCTACCTG CCTGGTGCTG GTCACAGTTC 120 

AGCTTCTTCA TGATGGTGGA TCCCAATGGC AATGAATCCA GTGCTACATA CTTCATCCTA 180

ATAGGCCTCC CTGGTTTAGA AGAGGCTCAG TTCTGGTTGG CCTTCCCATT GTGCTCCCTC 240 

TACCTTATTG CTGTGCTAGG TAACTTGACA ATCATCTACA TTGTGCGGAC TGAGCACAGC 300 

CTGCATGAGC CCATGTATAT ATTTCTTTGC ATGCTTTCAG GCATTGACAT CCTCATCTCC 360 

ACCTCATCCA TGCCCAAAAT GCTGGCCATC TTCTGGTTCA ATTCCACTAC CATCCAGTTT 420 

GATGCTTGTC TGCTACAGAT GTTTGCCATC CACTCCTTAT CTGGCATGGA ATCCACAGTG 480 

CTGCTGGCCA TGGCTTTTGA CCGCTATGTG GCCATCTGTC ACCCACTGCG CCATGCCACA 540 

GTACTTACGT TGCCTCGTGT CACCAAAATT GGTGTGGCTG CTGTGGTGCG GGGGGCTGCA 600 

CTGATGGCAC CCCTTCCTGT CTTCATCAAG CAGCTGCCCT TCTGCCGCTC CAATATCCTT 660 

TCCCATTCCT ACTGCCTACA CCAAGATGTC ATGAAGCTGG CCTGTGATGA TATCCGGGTC 720 

AATGTCGTCT ATGGCCTTAT CGTCATCATC TCCGCCATTG GCCTGGACTC ACTTCTCATC 780 

TCCTTCTCAT ATCTGCTTAT TCTTAAGACT GTGTT6GGCT TGACACGTGA AGCCCAGGCC 840 

AAGGCATTTG GCACTTGCGT CTCTCATGTG TGTGCTGTGT TCATATTCTA TGTACCTTTC 900 

ATTGGATTGT CCATGGTGCA TCGCTTTAGC AAGCGGCGTG ACTCTCCACT GCCCGTCATC 960 

TTGGCCAATA TCTATCTGCT GGTTCCTCCT GTGCTCAACC CAATTGTCTA TGGAGTGAAG 1020 

ACAAAGGAGA TTCGACAGCG CATCCTTCGA CTTTTCCATG TGGCCACACA CGCTTCAOAG 1080 

CCCTAGGTGT CAGTGATCAA ACTTCTTTTC CATTCAGAGT CCTCTGATTC AGATTTTAAT 1140 

GTTAACATTT TGGAAGACAG TATTCAGAAA AAAAATTTCC TTAATAAAAA TACAACTCAG 1200 

ATCCTTCAAA TATGAAACTG GTTGGGGAAT CTCCATTTTT TCAATATTAT TTTCTTCTTP 1260 

GTTTTCTTGC TACATATAAT TATTAATACC CTGACTAGGT TGTGGTTGGA GGGTTATTAC 1320 

TTTTCATTTT ACCATGCAGT CCAAATCTAA ACTGCTTCTA CTGATGGTTT ACAGCATTCT 1380 

GAGATAAGAA TGGTACATCT AGAGAACATT TGCCAAAGGC CTAAGCACAG CAAAGGAAAA 1440 

TAAACACAGA ATATAATAAA ATGAGATAAT CTAGCTTAAA ACTATAACTT CCTCTTCAGA 1500 

ACTCCCAACC ACATTGGATC TCAGAAAAAT ACTGTCTTCA AAATGACTTC TACAGAGAAG 1560 

AAATAATTTT TCCTCTGGAC ACTAGCACTT AAGGGGAAGA TTGGAAGTAA AGCCTTGAAA 1620 

AGAGTACATT TACCTACGTT AATGAAAGTT GACACACTGT TCTGAGAGTT TTCACAGCAT 1680 

ATGGACCCTG TTTTTCCTAT TTAATTTTCT TATCAACCCT TTAATTAGGC AAAGATATTA 1740 

TTAGTACCCT CATTGTAGCC ATGGGAAAAT TGATGTTCAG TGGGGATCAG TGAATTAAAT 1800

GGGGTCATAC AAGTATAAAA ATTAAAAAAA AAAGACTTCA TGCCCAATCT CATATGATGT 1860 

GGAAGAACTG TTAAAGAGAC CAACAGGGTA GTGGGTTAGA GATTTCCAGA GTCTTACATT 1920 

TTCTARAGGA GGTATTTAAT TTCTTCTCAC TCATCCAGTG TTGTATTTAG GAATTTCCTG 1980 

GCAACAGAAC TCATGGCTTT AATCCCACTA GCTATTGCTT ATTGTCCTGG TCCAATTGCC 2040 

AATTACCTGT GTCTTGGAAG AAGTGATTTC TAGGTTCACC ATTATGGAAG ATTCTTATTC 2100 

AGAAAGTCTG CATAGGGCTT ATAGCAAGTT ATTTATTTTT AAAAGTTCCA TAGGTGTTTC 2160 

TGATAGGCAG TGAGGTTAGG GAGCCACCAG TTATGATGGG AAGTATGGAA TGGCAGGTGT 2220 

TGAAGATAAC ATTGGCCTTT TGAGTGTGAC TCGTAGCTGG AAAGTGAGGG AATCTTCAGG 2280 

ACCATGCTTT ATTTGGGGCT TTGTGCAGTA TGGAACAGGG ACTTTGAGAC CGGGAAAGCA 2340 

ATCTGACTTA GGCATGGGAA TCAGGCATTT TTGCTTCTGA GGGGCTATTA CCAAGGGTTA 2400 

ATAGGTTTCA TCTTCAACAG GATATGACAA CAGTCTTAAC CAAGAAACTC AAATTACATA 2460 

TACTAAAACA TGTGATCATA TATGTGGTAA GTTTCATTTT CTTTTTCAAT CCTCAGGTTC 2520 

CCTGATATGG ATTCCTATNA CATGCTTTCA TCCCCTTTTG TAATGGATAT CATATTTGGA 2580 

AATGCCTATT TAATACTTGT ATTTGCTGCT GGACTGTAAG CCCATGAGGG CACTGTTTAT 2640 

TATTGAATGT CATCTCTGTT CATCATTGAC TGCTCTTTGC TCATCATTGA ATCCCCCAGC 2700 

AAAGTGCCTA GAACATAATA GTGCTTATGC TTGACACCGG TTATTTTTCA TCAAACCTGA 2760 

TTCCTTCTGT GCTGAACACA TAGCCAGGCA ATTTTCCAGC CTTCTTTGAG TTGGGTATTA 2820 

TTAAATTTTA GCCATTACTT CCAATGTGAG TGGAAGTGAC ATGTGCAATT TTTATACCTG 2880 

GCTCATAAAA CCCTCCCATG TGCAGCCTTT CATGTTGACA TTAAATGTGA CTTGGGAAGC 2940 
TATGTGTTAC ACAGAGTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNN AANNAAACTG 3000
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE 
Protein Accession #: not available, cloned at Eos 

1 11 21 31 41 51 

I I I I I I 

MVDPNGNESS ATYFILIGLP GLEEAQFVJLA FPLCSLYLIA VLGNLTIIYI VRTEHSLHEP 60 

MYIFLCMLSG IDILISTSSM PKMLAIFWFN STTIQFDACL LQMFAIHSLS GMBSTVLLAM 120 

AFDRYVAICH PLRHATVLTL PRVTKIGVAA WRGAALMAP LPVFIKQLPF CRSNILSHSY 180 

CLHQDVMKLA CDDIRVNWY GL1VIISAIG LDSLLISFSY LLILKTVLGL TRBAQAKAFG 240 

TCVSHVCAVP IFYVPFIGLS MVHRFSKRRD SPLPVILANI YLLVPPVLNP IVYGVKTKBI 300 
RQRILRLFHV ATHASEP 

SEQ ID NO:176 PM72 DNA SEQUENCE 
Nucleic Acid Accession #: NM_004624.1 

Coding sequence: 57-1544 (underlined sequences correspond to start and stop codons) 

TCGGAGCCTG CGGAGGGTGG TGGTGGTGGT GGTGGTGGCC CTCGCCCGCC TCACTCATGC 60 

CTCCTCCTCC TCTGCTCTCG CTCAGGCGCC TCGGTGGCGG TTGGTCGGCG GTTACGCGGC 120 

TGGTGGTCGC GGCGGCCGGG GCTCGCTCTC GGGGAGGCCG GGGCGGATCT CGCGGCGCAG 180 

GCGGCGGCGG CCGAGGTGGG GTCGCGCGGG GGAGGCGGCT CGAGCTTCGT GCTGCGCGCT 240 

CGCTCTTGGG CTCCTCGCTG CAGGAGGAGT GTGACTATGT GCAGATGATC GAGGTGCAGC 300 

ACAAGCAGTG CCTGGAGGAG GCCCAGCTGG AGAATGAGAC AATAGGCTGC AGCAAGATGT 360 

GGGACAACCT CACCTGCTGG CCAGCCACCC CTCGGGGCCA GGTAGTTGTC TTGGCCTGTC 420 

CCCTCATCTT CAAGCTCTTC TCCTCCATTC AAGGCCGCAA TGTAAGCCGC AGCTGCACCG 480 

ACGAAGGCTG GACGCACCTG GAGCCTGGCC CGTACCCCAT TGCCTGTGGT TTGGATGACA 540 

AGGCAGCGAG TTTGGATGAG CAGCAGACCA TGTTCTACGG TTCTGTGAAG ACCGGCTACA 600 

CCATTGGCTA CGGCCTGTCC CTCGCCACCC TTCTGGTCGC CACAGCTATC CTGAGCCTGT 660 

TCAGGAAGCT CCACTGCACG CGGAACTACA TCCACATGCA CCTCTTCATA TCCTTCATCC 720 

TGAGGGCTGC CGCTGTCTTC ATCAAAGACT TGGCCCTCTT CGACAGCGGG GAGTCGGACC 780 

AGTGCTCCGA GGGCTCGGTG GGCTGTAAGG CAGCCATGGT CTTTTTCCAA TATTGTGTCA 840 

TGGCTAACTT CTTCTGGCTG CTGGTGGAGG GCCTCTACCT GTACACCCTG CTTGCCGTCT 900 

CCTTCTTCTC TGAGCGGAAG TACTTCTGGG GGTACATACT CATCGGCTGG GGGGTACCCA 960 

GCACATTCAC CATGGTGTGG ACCATCGCCA GGATCCATTT TGAGGATTAT GGTCTGCTCA 1020 

GGTGCTGGGA CACCATCAAC TCCTCACTGT GGTGGATCAT AAAGGGCCCC ATCCTCACCT 1080 

CCATCTTGGT AAACTTCATC CTGTTTATTT GCATCATCCG AATCCTGCTT CAGAAACTGC 1140 

GGCCCCCAGA TATCAGGAAG AGTGACAGCA GTCCATACTC AAGGCTAGCC AGGTCCACAC 1200 

TCCTGCTGAT CCCCCTGTTT GGAGTACACT ACATCATGTT CGCCTTCTTT CCGGACAATT 1260 

TTAAGCCTGA AGTGAAGATG GTCTTTGAGC TCGTCGTGGG GTCTTTCCAG GGTTTTGTGG 1320 

TGGCTATCCT CTACTGCTTC CTCAATGGTG AGGTGCAGGC GGAGCTGAGG CGGAAGTGGC 1380 

GGCGCTGGCA CCTGCAGGGC GTCCTGGGCT GGAACCCCAA ATACCGGCAC CCGTCGGGAG 1440 

GCAGCAACGG CGCCACGTGC AGCACGCAGG TTTCCATGCT GACCCGCGTC AGCCCAGGTG 1500 

CCCGCCGCTC CTCCAGCTTC CAAGCCGAAG TCTCCCTGGT CTGACCACCA GGATCCCAGC 1560 

CCAAGCGGCC CCTCCCGCCC CTTCCCACTC GCAGCAGACG CCGGGGACAG AGGCCTGCCC 1620 

GGGCGCGCCA GCCCCGGCCC TGGGCTCGGA GGCTGCCCCC GGCCCCCTGG TCTCTGGTCC 1680 

GGACACTCCT AGAGAACGCA GCCCTAGAGC CTGCCTGGAG CGTTTCTAGC AAGTGAGAGA 1740 

GATGGGAGCT CCTCTCCTGG AGGATGCAGG TGGAACTCAG TCATTAGACT CCTCCTCCAA 1800 

AGGCCCCCTA CGCCAATCAA GGGCAAAAAG TCTACATACT TTCATCCTGA CTCTGCCCCC 1860 

TGCTGGCTCT TCTGCCCAAT TGGAGGAAAG CAACCGGTGG ATCCTCAAAC AACACTGGTG 1920 

TGACCTGAGG GCAGAAAGGT TCTGCCCGGG AAGGTCACCA GCACCAACAC CACGGTAGTG 1980 

CCTGAAATTT CACCATTGCT GTCAAGTTCC TTTGGGTTAA GCATTACCAC TCAGGCATTT 2040 

GACTGAAGAT GCAGCTCACT ACCCTATTCT CTCTTTACGC TTAGTTATCA GCTTTTTAAA 2100 

GTGGGTTATT CTGGAGTTTT TGTTTGGAGA GCACACCTAT CTTAGTGGTT CCCCACCGAA 2160 

GTGGACTGGC CCCTGGGTCA GTCTGGTGGG AGGACGGTGC AACCCAAGGA CTGAGGGACT 2220 

CTGAAGCCTC TGGGAAATGA GAAGGCAGCC ACCAGCGAAT GCTAGGTCTC GGACTAAGCC 2280 

TACCTGCTCT CCAAGTCTCA GTGGCTTCAT CTGTCAAGTG GGACTCTGTC ACACCAGCCA 2340 

TTCTTATCTC TCTGTGCTGT GGAAGCAACA GGAATCAAGA GACTGCCCTC CTTGTCCACC 2400 

CACCTATGTG CCAACTGTTG TAACTAGGCT CAGAGATGTG CACCCATGGG CTCTGACAGA 2460 

AAGCAGATCC TCACCCTGCT ACACATACAG GATTTGAACT CAGATCTGTC TGATAGGAAT 2520 

GTGAAAGCAC GGACTCTTAC TGCTAACTTT TGTGTATCGT AACCAGCCAG ATCCTCTTGG 2580 

TTATTTGTTT ACCACTTGTA TTATTAATGC CATTATCCCT GAATTCCCCT TGCCACCCCA 2640 

CCCTCCCTGG AGTGTGGCTG AGGAGGCCTC CATCTCATGT ATCATCTGGA TAGGAGCCTG 2700 

CTGGTCACAG CCTCCTCTGT CTGCCCTTCA CCCCAGTGGC CACTCAGCTT CCTACCCACA 2760 

CCTCTGCCAG AAGATCCCCT CAGGACTGCA ACAGGCTTGT GCAACAATAA ATGTTGGCTT 2820 
GGAAAAAAAA AAAA 

SEQ ID NO:177 PM72 Protein sequence: 

Protein Accession #: JC2195 

1 11 21 31 41 51 

I I I I I I 

MPPPPLLSLR RLGGGWSAVT RLWAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLELRAA 60 

RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LENETIGCSK MWDNLTCWPA TPRGQVWLA 120 

CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDEQQ TMFYGSVKTG 180 

YTIGYGLSLA TLLVATAILS LPRKLHCTRN YIHMHLFISF ILRAAAVFXK DLALFDSGES 240 

DQCSEGSVGC KAAMVFFQYC VMANPFWLLV EGLYLYTLLA VSFFSERKYF WGYILIGWGV 300 

PSTFTHVWTI ARIHFEDYGL LRCWDTINSS LWWIIKGPXL TSILVNFILF ICIIRILLQK 360 

LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HYIMFAFFPD NFKPEVKMVF ELWGSFQGF 420 

WAILYCFLN GBVQAELRRK WRRWHLQGVL GWNPKYRHPS GGSNGATCST GVSMLTRVSP 480 
GARRSSSFQA EVSLV 
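
The coding-sequence coordinates given with each DNA listing are 1-based and inclusive, so the range 57-1544 stated for SEQ ID NO:176 spans 1488 nucleotides, that is 495 codons plus a stop codon, in agreement with the 495 residues listed for SEQ ID NO:177. The following is a minimal Python sketch of how such a range can be mapped onto the corresponding protein listing. The helper names (`clean`, `coding_region`, `translate`) are hypothetical and not part of this application, the standard genetic code is assumed, and only the first two rows of SEQ ID NO:176 are used as input.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(listing):
    """Strip spaces, line breaks and position numbers from a listing block."""
    return "".join(ch for ch in listing.upper() if ch.isalpha())


def coding_region(sequence, start, stop):
    """Return the 1-based, inclusive range start..stop of a sequence."""
    return sequence[start - 1:stop]


def translate(cds):
    """Translate complete codons until a stop codon or the end of the input.

    Codons containing characters outside T, C, A, G are reported as 'X'.
    """
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First two rows (positions 1-120) of SEQ ID NO:176 as printed above.
ROWS_1_TO_120 = """
TCGGAGCCTG CGGAGGGTGG TGGTGGTGGT GGTGGTGGCC CTCGCCCGCC TCACTCATGC 60
CTCCTCCTCC TCTGCTCTCG CTCAGGCGCC TCGGTGGCGG TTGGTCGGCG GTTACGCGGC 120
"""

sequence = clean(ROWS_1_TO_120)
# The coding sequence starts at position 57; translate what is available.
prefix = translate(coding_region(sequence, 57, 120))
print(prefix)
```

The printed prefix can be compared with the first residues of SEQ ID NO:177; any mismatch points to a transcription error in one of the two listings.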



SEQ ID NO:178 BFF8 DNA SEQUENCE 
Nucleic Acid Accession #: AL133619 
Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ATGAGCGGTG CGGGGGTGGC GGCTGGGACG CGGCCCCCCA GCTCGCCGAC CCCGGGCTCT 60 
CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT TGAGGCCGCA GAGCCCGCAG 120 
CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG AGAAAAGCCT GCAGTTCCTG 180 
CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG AGATCGAGCA TCTGAAGCGG 240 
GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGGCCGGCCC TGCCTCCCCA GGCACACTCA 300 
ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT CCAGCACACG CCTGGGCTCA 360 
GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG CCCACCTGGC TGCACTGGCC 420 
CCTGTATGCC AACCCAGTGG GTACAGGTTC TGGGGGACCT GGACAGATGC CGCTACCTCT 480 
AGCCGTGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG TGCTGCTCTC GGGAAGCCCA 540 
GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT GCTCCCCAGA CCTCCCTCCT 600 
CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC CCTGCCCTGC TAGATCTTTG 660 
CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC CTATGGCTCT GAGTCCTCAC 720 
ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG GATCCCTTCC TGCCATCTGG 780 
GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT TTCCTTGCCA CTTGTCCAAG 840 
GCACTTCCCC ATCCTGACAG CGGCCCCCAC CCAGCCCAGG ATCCTGGGCT GTGGTCTCAA 900 
GCTCACTTCC CATTATCTTT GGGGCTGGGG CTGACATCAG GAGGACATCT GACTGGTGGA 960 
TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA GGGCTCTCCC TTCCCAGGGA 1020 
GACATGGAGA AGGGGGTTGA GGGAGGGCCC TTCCCTAGCC GCTGTGGCAA CTCCAGTGAG 1080 
CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC CCTGCAGTGC TGGGGACGCT 1140 
GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT GCTGTTCCAT GTGTCCCAAG 1200 
CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT CCAGGGCCTC TGCTCCCTTG 1260 
GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC CGGGAGGACC CAGCCCTGCC 1320 
AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG GCAAGCGTGG GCGTCTTGCG 1380 
GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGCC TCTCCATGTC AAGCTTCCAG 1440 
TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA AGGCCAGGCC CCAGCCCGGC 1500 
TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA AGGCGGACCT GGAAGAGGAG 1560 
CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG TACAAGGGCA GGCCAGAAAG 1620 
GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG GGAACAGCCA GCACCAGGGC 1680 
AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC CCCTTCCCCT GCGAAAOCCC 1740 
ACCACACTTA GGCAGTGCGA AGTGCTCATC CGOGAGCTGT GGAATACCAA CCTCCTGCAG 1800 
ACCCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA GGCAGAGGCC CCAGGCAGCC 1860 
CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCACGC ATTTCCCCAA GGTCTCCACC 1920 
AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG AGCGTGCCAT CCTGCCCGCA 1980 
CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA AGAGGCTGCA GGCAATGCAG 2040 
AAACGGCGCC TGCATCGCTC AGTGCTTIQA 

SEQ ID NO:179 BFF8 Protein sequence 
Protein Accession #: T43457 

1 11 21 31 41 51 
I I I I I I 

MSGAGVAAGT RPPSSPTPGS RRRRQRPSVG VQSLRPQSPQ LRQSDPQKRN LDLEKSLQFL 60 
QQQHSEKLAK LHEEIEHLRR ENKGEPARGP RPALPPQAHS TLPLPQHRNT AINSSTRLGS 120 
GGTQDGEPLQ TVLAHLAALA PVCQPSGVRF WGTWTDAATS SRGWTMLCSQ AQHVLLSGSP 180 
GPEVIAGRQV ATGCSPDLPP PSRAEMGRNP WDSPCPARSL PQIAAVARPR ISSPMALSPH 240 
MLGAQGIWTH SIQGSLPAIW AATMGTKGGS RVLFPCHLSK ALPHPDSGPH PAQDPGLWSQ 300 
AHFPLSLGLG LTSGGHLTGG WSQPGNIAAG AVPRALPSQG DMEKGVEGGP FPSRCGNSSE 360 
LFWAKCGPSR QPQPCSAGDA DRTREEAMLS LGTCCSMCPK PSCFPDGPSG NHLSRASAPL 420 
GARWVCINGV WVEPGGPSPA RLKEGSSRTH RPGGKRGRLA GGSADTVRSP ADSLSMSSFQ 480 
SVKSISNSAN SQGKARFQPG SFNKQDSKAD VSQKADLBEE PLLHNSKLDK VFGVQGQARK 540 
EKAEASNAGA ACMGNSQHQG RQKGAGAHPP MILPLPLRKP TTLRQCEVLI RELWNTNLLQ 600 
TQELRHLKSL LEGSQRPQAA PEEASFPRDQ EATKFPKVST KSLSKKCLSP PVAERAILPA 660 
LKQTPKNNFA ERQKRLQAMQ KRRLHRSVL 



SEQ ID NO:180 BCR4 DNA SEQUENCE 
Nucleic Acid Accession #: NM_012319.2 
Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CTCGTGCCGA ATTCGGCACG AGACCGOGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 

CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120 

GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAQ ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780 

CCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAACAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 

SEQ ID NO:181 BCR4 PROTEIN SEQUENCE 
Protein Accession #: NP_036451 

1 11 21 31 41 51 

I I I I I I 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISFNW ESGINVDLAI STRQYHLQQL 60 

FYRYGENNSL SVBGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCFDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180 

SVSASEVTST VYNTVSBGTH PLETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240 

NESVSEPRKG FKYSRNTNEN PQECFNASKL LTSHGMGIQV PLMATEFNVL CPAIINQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISFLSL LGVILVPLMN RVFFKFLLSF 360 

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAMEMKR GPLFSHLSSQ NIEESAYFDS 420 

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480 

EEKVDTDDRT EGYLRADSQE PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540 

HSHFHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVZ 600 

KGDGLHNFSD GLAIGAAPTE GLSSGLSTSV AVPCHELPHE LGDFAVLLKA GMTVKQAVLY 660 

KALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEML HNDASDHGCS 720 
RWGYFFLQNA GMLLGFGIML LISIPEHKIV PRINF 

SEQ ID NO:182 BCY2 DNA sequence 

Nucleic Acid Accession #: NM_001203 

Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CGCGGGGCGC GGAGTCGGCG GGGCCTCGCG GGACGCGGGC AGTGCGGAGA CCGCGGCGCT 60 
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT 180 
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAGTTAAA CTTACAAGCC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAA1TA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCACCCCOCG TCCAAAGGTC 360 
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTTCACGAT GATAGAAGAG GATGACTCTG GGTTGCCTGT GGTCACTTCT 480 
GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA 540 
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACCCTACA 600 
CTGCCTCCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660 
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG 720 
TATAAAAGAC AAGAAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780 
ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840 
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATSG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960 
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140 
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380 
ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 1440 
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG 1500 
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCJJSATAGGAGAG GAAAAGTAAG 1800 
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 1980 
TCTGTTTGTA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT 

SEQ ID NO:183 BCY2 Protein sequence 

Protein Accession #: NP_001194 

1 11 21 31 41 51 
I I I I I I 

MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 60 
DSGLPWTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120 
GPIHHRALLI SVTVCSLLLV LIILFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 180 
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENILG FIAADDCGTG SWTQLYLTTD YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCCIAD LGLAVKHSD 360 
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQSYIMADM YSPGLDLWEV ARRCVSGGIV 420 
EEYQLPYHDL VPSDPSYEDM REIVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480 
RLTALRVKKT LAKMSESQDI KL 



SEQ ID NO:184 CBF9 DNA sequence 

Nucleic Acid Accession #: AC005383 

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360 

GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAOTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540 

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 

CAGGAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720 

CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAQ GCAGAAATGC TTCTGTGCCC 780 

CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840 

CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTGGGAG 900 

GAGCTGCATG CACTGGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960 

GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020 

ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080 

GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140 

GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCACCCTGC CACCTGCTAC 1200 

AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC AGAATGGAGG CACATGTGTT 1260 

CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320 

TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380 

GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440 

GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500 

CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCCTG ACCTGGTCTG GAGCCTCGAT 1560 

GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620 

CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680 

CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAO CGCGTCACGC AAGGGCGCGA 1740 

GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800 

GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860 

GAGCTGCAGQ GGAAGCTGTG CAGCCGGCAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920 

CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980 

AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040 

CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100 

GCTGCGATGC TGCGGGCCAT TAGCCAGGCC COCTACCTAG QTGGGGTGGG CTCAGCCGGC 2160 

ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGACCGTCC AGAGGGGTGC CCGGCCTGGT 2220 

GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280 

GCCCAGAAGC TGAGGAACAA TCGCATCTCT GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA 2340 

AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCCC TGATCCACGT GGCAGCTTAC 2400 

GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460 

CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520 

GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGGAGGGCC CCCACTGCGA GAACCGTGAG 2580 

TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640 

ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700 

GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGCCC CAGGTCCTTA GAATGTCTGC 2760 

TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820 

ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880 

TTGATGTGTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTGTTGAG GCTATGTCAT 2940 

CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000 

CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060 

AGGCCTTTAC TAGAGCATCC TTTGGACGGC GAAGGCCACG GOCTTTCAAG ATGGAAAGCA 3120 

GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TGAAAGGGGG 3180 

CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240 

GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300 

TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360 

ACCTTGAAGG TCTTC 

SEQ ID NO:185 CBF9 Protein sequence 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKHMWCSAAV DIMFLLDGSN 60 

SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQPSSTPH LBFPLDSFST QQEVKARIKR 120 

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEHV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420 

LTGSALRQAA ERGFGSATRT GQDRPRRVW LLTESHSEDE VAGPARHARA RELLLLGVGS 480 

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540 

SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLWYGSQVQ TAFGLDTKPT RAAMLRAISQ 600 

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAWVLT GGRGAEDAAV PAQKLRNNGI 660 

SVLWGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720 

CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HHAPVQEGSS 780 

RTPPSNYREG LGTEMVPTFW NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence 

Nucleic Acid Accession #: AF272890 

Coding Sequence: 87-1520 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

TGCTACCCGC GCCCGGGCTT CTGGGOTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 
CCCGCCCCCG GCCTCCGCAG CTCGGCATGG GCGCGGGGGT GCTCGTCCTG GGCGCCTCCG 120 
AGCCCGGTAA CCTGTCGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 180 
TGCTGGTGCC CGCGTCGCCG CCCGCCTCGT TGCTGCCTCC CGCCAGCGAA AGCCCCGAGC 240 
CGCTGTCTCA GCAGTGGACA GCGGGCATGG GTCTGCTGAT GGCGCTCATC GTGCTGCTCA 300 
TCGTGGCGGG CAATGTGCTG GTGATCGTGG CCATCGCCAA GACGCGGCGG CTGCAGACGC 360 
TCACCAACCT CTTCATCATG TCCCTGGCCA GCGCCGACCT GGTCATGGGG CTGCTGGTGG 420 
TGCCGTTCGG GGCCACCATC GTGGTGTGGG GCCGCTGGGA GTACGGCTCC TTCTTCTGCG 480 
AGCTGTGGAC CTCAGTGGAC GTGCTGTGCG TGACGGCCAG CATCGAGACC CTGTGTGTCA 540 
TTGCCCTGGA CCGCTACCTC GCCATCACCT CGCCCTTCCG CTACCAGAGC CTGCTGACGC 600 
GCGCGCGGGC GCGGGGCCTC GTGTGCACCG TGTGGGCCAT CTCGGCCCTG GTGTCCTTCC 660 
TGCCCATCCT CATGCACTGG TGGCGGGCGG AGAGCGACGA GGCGCGCCGC TGCTACAACG 720 
ACCCCAAGTG CTGCGACTTC GTCACCAACC GGGCCTACGC CATCGCCTCG TCCGTAGTCT 780 
CCTTCTACGT GCCCCTGTGC ATCATGGCCT TCGTGTACCT GCGGGTGTTC CGCGAGGCCC 840 
AGAAGCAGGT GAAGAAGATC GACAGCTGCG AGCGCCGTTT CCTCGGCGGC CCAGCGCGGC 900 
CGCCCTCGCC CTCGCCCTCG CCCGTCCCCG CGCCCGCGCC GCCQCCCGGA CCCCCGCGCC 960 
CCGCCGCCGC CGCCGCCACC GCCCCGCTGG CCAACGGGCG TGCGGGTAAG CGGCGGCCCT 1020 
CGCGCCTCGT GGCCCTACGC GAGCAGAAGG CGCTCAAGAC GCTGGGCATC ATCATGGGCG 1080 
TCTTCACGCT CTGCTGGCTG ---------- TGGCCAACGT GGTGAAGGCC TTCCACCGCG 1140 
AGCTGGTGCC CGACCGCCTC ---------- TCAACTGGCT GGGCTACGCC AACTCGGCCT 1200 
TCAACCCCAT CATCTACTGC CGCAGCCCCG ACTTCCGCAA GGCCTTCCAG GGACTGCTCT 1260 
GCTGCGCGCG CAGGGCTGCC CGCCGGCGCC ACGCGACCCA CGGAGACCGG CCGCGCGCCT 1320 
CGGGCTGTCT GGCCCGGCCC GGACCCCCGC CATCGCCCGG GGCCGCCTCG GACGACGACG 1380 
ACGACGATGT CGTCGGGGCC ACGCCGCCCG CGCGCCTGCT GGAGCCCTGG GCCGGCTGCA 1440 
ACGGCGGGGC GGCGGCGGAC AGCGACTCGA GCCTGGACGA GCCGTGCCGC CCCGGCTTCG 1500 
CCTCGGAATC CAAGGTGTAQ GGCCCGGCGC GGGGCGCGGA CTCCGGGCAC GGCTTCCCAG 1560 
GGGAACGAGG AGATCTGTGT TTACTTAAGA CCGATAGCAG GTGAACTCGA AGCCCACAAT 1620 
CCTCGTCTGA ATCATCCGAG GCAAAGAGAA AAGCCACGGA CCGTTGCACA AAAAGGAAAG 1680 
TTTGGGAAGG GATGGGAGAG TGGCTTGCTG ATGTTCCTTG TTG 

SEQ ID NO:187 PAV1 Protein sequence 
Protein Accession #: AA011176 

1 11 21 31 41 51 
I I I I I I 

MGAGVLVLGA SEPGNLSSAA PLPDGAATAA RLLVPASPPA SLLPPASESP EPLSQQWTAG 60 
MGLLMALIVL LIVAGNVLVI VAIAKTPRLQ TLTNLFIMSL ASADLVMGLL WPFGATIW 120 
WGRWEYGSFF CELWTSVDVL CVTASIETLC VIALDRYLAI TSPFRYQSLL TRARARGLVC 180 
TVWAISALVS FLPILMHWWR AESDEARRCY NDPKCCDFVT NRAYAIASSV VSFYVPLCIM 240 
AFVYLRVFRE AQKQVKKIDS CERRFLGGPA RPPSPSPSPV PAPAPPPGPP RPAAAAATAP 300 
LANGRAGKRR PSRLVALREQ KALKTLGIIM GVFTLCWLPF FLANWKAFH RELVPDRLFV 360 
FFNWLGYANS AFNPIIYCRS PDPRKAFQGL LCCARRAARR RHATHGDRPR ASGCLARPGP 420 
PPSPGAASDD DDDDWGATP PARLLEPWAG CNGGAAADSD SSLDEPCRPG FASESKV 



SEQ ID NO:188 BCO2 DNA sequence 
Nucleic Acid Accession #: AJ400877 
Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120 
CGGTGCTGCT GCTGCTGCTG CTGCTGCXGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180 
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240 
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360 
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420 
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480 
AGAACAATGQ CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540 
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600 
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660 
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840 
AGCGAGAGGA CACTGTCCTO GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACCGCACCTG TAAGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020 
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080 
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140 
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200 
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260 
CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440 
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500 
GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT COGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800 
TGAGCTGCAT CCTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860 
AGGCCGTCCA CAGGGAGCAG TTTCACCTCC AGCTCTCAGG CATGAACCTC GACGTGGCTA 1920 
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980 
CAGAAAACCA ATGTGTCAGT TGCAGGGCTG GGACCTATTA TGATGGAGCA CGAGAACGCT 2040 
GCATTTTATG TCCAAATGGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100 
GCCCAAGACC AGGAAATTCT GGGGCCCTGA AGACCCCAGA AGCTTGGAAT ATGTCTGAAT 2160 
GTGGAGGTCT GTGTCAACCT GGTGAATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220 
GTGCCCTGGG CACGTTCCAG CCTGAAGCTO GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280 
GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCACGA CTGTGAAACC AGAGTTCAAT 2340 
GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 
CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460 
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 
AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCCTGAGA 2640 
TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGGAAA ACCTCTTCAT 2700 
CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC GCCTTCACCT 2760 
CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820 
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880 
GAGATGGCAG GCTCTATGCA TCTGAGAAOC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 
TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000 
AGTCCCGAGA GATGTTTCCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060 
TTTTGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120 
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180 
CCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240 
GAACTTGGTT TTTCTTTCCC AGCATCGTGG ATGTAGACTG AGAATGGCTT TOAGTGGCAT 3300 
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360 
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420 
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480 
CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540 
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCTGGGAGG 3600 
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTGGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BCO2 Protein sequence 

Protein Accession #: CAB92285 



1 11 21 31 41 51 
I I I I I I 

MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120 
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180 
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD TADGFECSCH 240 
PQYKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSEDCGCK KGFKLLTDEK 360 
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480 
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540 
PGAPGRPSTP KEMHTVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600 
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA REROLCPNG 660 
TPQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 720 
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEFG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 
PPPKRREUV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSNEGNS ARGPQVPYVT YDEDYQEUE DIVRDGRLYA SENHQEILKD KKUKALFDV 960 
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK 

SEQ ID NO:190 BFG1 DNA sequence

Nucleic Acid Accession #: AF097170

Coding sequence: 1-1725 (underlined sequences correspond to stop codon)

1 11 21 31 41 51
I I I I I I 

AAGGAGGCGG CCTCCGGGAA AAGCG ACCGC AGGACTCCTG AGAGCAGCCT CCATGAGGCC 60 
CTGGACCAGT GCATG ACCGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTACCACT CACTGACATA TGCCACCATC 180 
CTGGAGATGC AGGCCATGAT GACCTTTGAC CCTCAGGACA TCCTGCTTGC CGGCAACATG 240 
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300 
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360 
GAGGTCTGCT ATGC AG AGTG CCTGCTGC AG CGAGCAGCCC TGACCTTCCT GCAGGACG AG 420 
AACATGGTG A GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480 
GAGCTGGACA GCCTTGTTCA OTCCTCACAA TACTGCAAGG GTOAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TOTAGCGGC C TTCA ACCTGA CACTGTCCAT GCTTCCTACT 600 
AGGATCCTG A GGCTGTTGGA GTTTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCTGCTG 660 
CAGCTGG AGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720 
CTGTGCTACC AC ACCTTCCT CACCTTCGTG CTCGGTACTG GG AACGTCAA CATCG AGG AG 780 
GCCGAGAAGC TCTTGAAGCC CTACCTGAAC CGGTACCCTA AGGGTGCCAT CTTCCTGTTC 840 
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTGATG CAGCCATCCG GCGTTTCGAG 900 
GAGTGCTGTG AGGCCCAGCA GCACTGG AAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCT TCACCTACAA GGGCCAGTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020 
AGCAAGGAGA ACTGCTGGTC CAAGGCCACC TACATTTACA TGAAGGOCGC CTACCTCAGC 1080 
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAG AA GTTTGCCATC 1200 
CGGAAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTGCC TGCTCTGGAA 1260 
ATGATGTACA TCTGGAACGG CTACGCCGTG ATTGGGAAGC AGCCGAAACT CACGGATGGG 1320 
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTGGAGA AAGGCCCAGA GAACGAGTAC 1380 
TCAGTGGATG ACGAGTGCTT GGTGAAATTG TTGAAAGGCC TGTGTCTGAA ATACCTGGGC 1440 
CGTGTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500 
TATGACCACT ACTTGATCCC AAACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560 
GACAGAAACG AAG AGGCCAT CAAACTTTTG GAATCTGCCA AGCAAAACTA CAAGAATTAC 1620 
TCCATGGAGT CAAGGACACA CTTTCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680 
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGIAQCTTTG TGCAGCAGTT 1740 
CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTGCTGAA AACATTTCAA AATACXXCCT 1800 
CCCCCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGG CACAACATAG 1860 
TGTATCCGTG CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGCCAAG 1920 
GGCAGAGCAG GTGGAGCCCT CTGCCTGCCC TATCACACAT ACGGGTACTT GCTTTTCACT 1980 
GTGATGTTTA AGAGAATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040 
CACAGTTGGC TTTAAAAACC AACAACAATC AACCACCTGT AAGTCTTTGT CITCACCTAT 2100 
TATCATCTGG AGGTAAATCT CTTTATATGA TGATGCCAAA GGGCAAATTG CTTTTCAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTG ACGGAAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220 
AGAGGCTAAG CCTCAGGCTT CAATGCTTCT GGGGTTGGGC ATGAGGATGT ACACAGACAC 2280 
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCTTTTGT AAATTTCCAA TTTAAAAATC 2340 
AAGCACGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAGAAAAATC AATCTCTACC 2400 
AGTAGAAAAT GCCAGGGCTT GATGGAAGAG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460 
AAATTTGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTGTTAAA ACTGTTATTC TTGAAAAAAA AAAAAAAAAA 2640 
AA 

SEQ ID NO:191 BFG1 Protein sequence

Protein Accession #: AAC39582



1 11 21 31 41 51 
I I I I I I

MTALDLFLTN QFSEALS YLK PRTKESMYHS LTYATILEMQ AMMTFDPQDI LLAGNMMKEA 60 
QMLCQRHRRK SSVTDSFSSL VNRPTLGQFT EEEIHAEVCY AECLLQRAAL TFLQDENMVS 120 
FIKGGIKVRN SYQTYKELDS LVQSSQYCKG ENHPHFEGGV KLGVGAFNLT LSMLPTRILR 180
LLEFVGFSGN KDYGLLQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTG NVNffiEAEKL 240 
LKFYLNRYPK GAIFLFFAGR IEVDCGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYTYM KAAYLSMFGK EDHKPFGDDE VELFRAVPGL 360 
KLKIAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDGILEI 420 
ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYLGRVQE AEENFRSISA NEKKIKYDHY 480 
UPNAULELA LLLMEQDRNE EAIKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540 
SRSMVSSVSL 



SEQ ID NO:192 BFQ6 DNA sequence

Nucleic Acid Accession #: NM.0329B3

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I

ATGACTAGGA AGAGG ACATA CTGGGTGCCC AACTCTTCTG GTGGCCTCGT GAATCGTGGC 60 
ATOGACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAACCTATAC TCTCCAAGAT 120 
GGCCCCTGG A GTCAGCAAG A GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACCG 180 
TGGGGGAAOT ATGATGCTGC CTTGAGAACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240 
CCTGCCCCCC AGCCCCTGOA CAATGCTGGC CTGTTCTCCr ACCTCACCGT GTCATGGCTC 300 
ACCCCGCTCA TGATCCAAAG CTTACGGAGT CGCTTAGATG AGAACACCAT CCCTCCACTG 360 
TCAGTCCATG ATGCCTCAGA CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GGAAGAAGAA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTOCAGAGA 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAG TGTACTCGGG 540 
CCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600 
CATGGAGTGG GACTCTGCTT TGCC Cl ' I ' l i 1 CTCTCCGAAT GTGTGAAGTC TCTGAGTTTC 660 
TCCTCCAGTT GO ATCATCA A CCAACGCACA GCCATCAGGT TCCGAGCAGC TGTTTCCTCC 720 
TTTGCCnTG AGAAGCTCAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGCC 780 
ATCAGCTTCT TCACCGGTGA TGTAAACTAC CTGTTTGAAG GGGTGTGCTA TGGACCCCTA 840 
GTACTGATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTGGA 900 
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
ACAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTGAGG TCAGCGACCA GCGCATCCGT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTGATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTG AAGG TATGG AAAGT CTGACTTTCT GCTCCAAACC TGGTGATGGC 1140 
ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200 
ATTGCAGTCA AAGGTCTCAC GAATICCAAG TCTGCAGTGA TGAGGTTCAA GAAGTITTTC 1260 
CTCCAGOAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTOTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGG AGA GGAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440 



CCAGAGGAAG AAGGGAACAG CCTGGGCCCA G AGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TGTTAGGGGT CTGCGGCAAC ACGGGGAGTO GTAAG AGCAG CCTGTTGTCA 1560 
GCCATCCTGG AGGAGATGCA CTTGCTCGAG GGCTCGGTGG GGGTGCAGGO AAGCCTGGCC 1620 
TATGTCCCCC AGCAGGCCTG G ATCGTCAGC GGG AAC ATC A GGG AG AACAT OCTCATGGG A 1680 
GGCGCATATG ACAAGGCCCG ATACCTCCAG GTGCTCCACT GCTGCTCCCT GAATCGGGAC 1740
CTGGAACTTC TGCCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800 
GGGGGGCAGA AACAGAGGAT CAGCCTGGCC CGCGCCGTCT ATTCCGACCG TCAGATCTAC 1860 
CTGCTGGACG ACCCCCTGTC TGCTGTGG AC GCCCACGTGG GGAAGCACAT TTTTGAGGAG 1920
TGCATTAAGA AGACACTCAG GGGGAAGACG GTCGTCCTGG TGACCCACCA GCTGCAGTAC 1980
TTAGAATTTT GTGGCCAG AT CATTTTGTTG OAAAATGGGA AAATCTGTGA AAATGGAACT 2040
CACAGTGAGT TAATGCAGA A AAAGGGG AAA TATGCCCAAC TTATCCAGAA GATGCACAAG 2100 
GAAGCCACTT CGGACATOTT GCAGGACACA GCAAAGATAG CAGAGAAGCC AAAGGTAGAA 2160 
AGTCAGGCTC TGGCCACCTC CCTGGAAG AG TCTCTCAACG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AGGAGG AGGA G ATGGAAGAA GGCTCCTTGA GTTGGAGGGT CTACCACCAC 2280 
TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340
ATCGTCTTCT TAACG ATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGGA GCAGGGCTCG 2400 
GGGACCAATA GCAGCCGAGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460 
AATCCTC AAC TGTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGCCCTGCT CCTCATCTGT 2520 
GTGGGGGTCT GCTCCTCAGG G ATTTTC ACC AAAGTCACGA GGAAGGCATC CACGGCCCTG 2580 
CACAACAAGC TCTTC AACAA GGTTTTCCGC TGCCCC ATGA GTTTCTTTGA CACCATCCCA 2640
ATAGGCCGGC TTTTGAACTG CTTCGCAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCGCCGTCCT GTTGATTGTC 2760 
AGTGTGCTGT CTCCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880 
TCTCCTTTAT TCTCCCACAT CCTCAATTCT CTGCAAGGCC TGAGCTCCAT CCATGTCTAT 2940

GGAAAAACTG AAGACTTCAT CAGCCAGTTT AAGAGGCTGA CTGATGCGCA GAATAACTAC 3000 
CTGCTGTTGT TTCTATCTTC CACACGATGG ATGGCATTGA GOCTGGAGAT CATGACCAAC 3060 
CTTGTGACCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTCCTCCAC CCCCTACTCC 3120 
TTTAAAGTCA TGGCTGTCAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTGCC 3180
CGGATTGGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240
AAGATGTGTG TCTCGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300 
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360 
ACCGTGCTTC ACGGCATCAA CCTOACC ATC CGCGGCCACG AAOTGGTGGG CATCGTGGGA 3420 
AGG ACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GGAGCCCATG 3480
GCAGGCCGGA TTCTCATTGA CGGCGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540
TCCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGGAACCAT CAGATTCAAC 3600 
CTAGATCCCT TTGACCGTCA CACTGACC AG CAGATCTGGG ATGCCTTGGA GAGGACATTC 3660 
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780
TCCAAGATCA TCCTTATCGA TG AAGCCACA GCCTCCATTG ACATGGAGAC AGACACCCTG 3840
ATCCAGCGCA CAATCCGTGA AGCCTTCCAG GGCTGCACCG TGCTCGTCAT TGCCCACCGT 3900 
GTCACCACTG TGCTG AACTG TG ACCAC ATC CTGGTTATGG GCAATGGGAA GGTGGTAG AA 3960 
TTTGATCGGC CGG AGGTACT GCGG AAG A AG CCTGGGTC AT TOTTCGCAGC CCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG AJAAGGAGAT GTGGAGACTT CATGGAGGCT GGCAGCTGAG 4080 
CTCAG AGGTT CACAC AGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTGG AG 4140 
ATGAGAACTT CTCCTGG AAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200 
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACCCCAGAAC CATCTAAGAC 4260 
ATGGG ATTCA GTGATCATGT GGTTCTCCTT TTAACTTACA TGCTG AATAA TTTTATAATA 4320 
AGGTAAAAGC TTATAGTTTT CTGATCTGTG TTAGA AGTGY TGCAAATGCT GTACTGACTT 4380
TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA

SEQ ID NO:193 BFQ6 Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG UYKTYTLQD GPWSQQERNP EAPGRAAVPP 60 
WGKYDAALRT MIPFRPKPRF PAPQPLDNAG LFSYLTVS WL TPLMIQS LRS RLDENTIPPL 120 
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRLIFDALLG ICFCIAS VLG 180 

PUJIPiaLE YSEEQLGNW HGVGLCFALF LSECVKSLSF SSSWUNQRT AIRFRAAVSS 240
FAFEKUQFK SVIHTTSGEA ISFFTGDVNY LFEGVCYGPL VUTCASLVI CSISSYFTIG 300 
YTAFIAILCY LLVFPLAVFM TRMAVKAQHH TSEVSDQRIR VTSEVLTCIK LIKMYTWEKP 360 
FAKUEGMES LTFCSKPGDG MAFSMLASLN LLRLSVFFVP IAVKGLTNSK S AVMRFKKFF 420 
LQESPVFYVQ TLQDPSKALV FEEATLSWQQ TCPGWXGAL ELERNGHASE GMTRPRDALG 480 

PEEEGNSLGP ELHKINLVVS KGMMLGVCGN TGSGKSSLLS AILEEMHLLE GSVGVQGSLA 540
YVPQQAWIVS GN1RENILMG GAYDKARYLQ VLHCCSLNRD LELLPFGDMT HGERGLNLS 600 
GGQKQRISLA RAVYSDRQIY LLDDPLSAVD AHVGKHIFEB CIKKTLRGKT WLVTHQLQY 660 
LEFCGQIUJL ENGKICENGT HSELMQKKGK YAQLIQKMHK EATSDMLQDT AK1AEKPKVE 720 

SQALATSLEE SLNGNAVPEH QLTQEEEMEE GSLSWRVYHH YIQAAGGYMV SCOFFFWL 780

IVFLTIFSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLUC 840
VGVCSSGOT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRLLNCFAG DLEQLDQLLP 900 
IFSEQFLVLS LMVIAVLUV SVLSPYILLM GAHMVICH YYMMFKKA1G VFKRLENYSR 960 
SPLFSHILNS LQGLSSIHVY GKTEDF1SQF KRLTDAQNNY LLLFLSSTRW MALRLEIMTN 1020 
LVTLAVALFV AFGISSTPYS FKVMAVNIVL QLASSFQATA RIGLETEAQF TAVERILQ YM 1080 

KMCVSEAPLH MEGTSCPQGW PQHGEUFQD YHMKYRDNTP TVLHGINLTI RGHEVVGIVG 1140
RTGSGKSSLG MALFRLVEPM AGRHJDGVD ICSIGLEDLR SKLSVIPQDP VLLSGTIRFN 1200 
LDFFDRHTDQ QIWDALERTF LTKAISKFPK KLHTDVVENG GNFSVGERQL LClARAVtRN 1260 
SKHUDEAT ASIDMETDTL IQRTIREAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKWE 1320 
FDRPEVLRKK POSLFAALMA TATSSLR 
SEQ ID NO:194 BHBB DNA sequence

Nucleic Acid Accession #: AAS83251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51



ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GGTGCGCACA CCTCCCGAGG GCGAGGCAGC 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGACCG CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGCCT GGATCCGCGC CCAGCAGCAG 240 

CCGCGGCCGC CGCCAGCTGG GCAGGCTCCC GGGACTGOGG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC CTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAGG CAGTCCCTAA GGGGACCGGG CCACCGGCTG AGGACGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGCGGCAGA GGGGAGTGGC 540 

CCGCGCGGAA AGCGOCGCGG GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTTCTCGGAG ACCGTCCTGC GCTCTCTGGA GACGCGCTGT CCGCGCCCAG GGTGGTGCCA 660 

TGTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATCCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGACGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGGTGCGCC CCCACCCGTG AGGGCCTGGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCAGCGGCC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

6GGGCCCGGG CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT ACGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800 

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATG 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

TTTTTTTTTT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATGTTGGCTG GGCTGGTCTC ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCTA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTGQ 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820 

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTG 2940 

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GCCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACTTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATGG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
SEQ ID NO:195 BHBB Protein sequence

Protein Accession #: none found

1 11 21 31 41 51
I I I I I I

MLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSRGRGS DRERESRPEA AGLLWDRAAA 60
GEAEKGNRGE PPAWIRAQQQ PRPPFAGQAP GTAAGGAQDP RLRPGRSRGR VRLPVKPPEA 120
SGRQPRGPSD CIPRFPSASA THKAVPKGTG PPAEDGDGIX3 APGPRARRRR LLGVAAE6S6 180
PRGKRRGTVS DEAEGSPGPR LM3DRPALSG DALSAPRWP CGALAARPSP HPGTPLRSCS 240
CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI GFQCPERFDG GDATICCGSC ALRYCCSSAE 300
ARLDQGGCDN DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGEGAPPPV RAWQRCSPEG 360
SPKGRQLLRA FPGLLPRARR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VFVAFIILGS 420
LVAACCCRCL RPKQDPQQSR AFGGNRLMET IPMIPSASTS RGSSSRQSST AASSSSSANS 480
GARAPPTRSQ TNCCLPEGTM NNVYVNMPTN FSVLNCQQAT QIVPHQGQYL HPPYVGYTVQ 540
HDSVPMTAVP PFMDGLQPQY RQIQSPFPHT NSEQKMYPAV TV



SEQ ID NO:196 CQA5 DNA SEQUENCE

Nucleic Acid Accession #: AA088458

Coding sequence: 882-1995 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51
I I I I I I

GCCCTTGGAC ACTGACATGG ACTGAAGGAG TAGAATGGAG CACGAGGACA CTGACATGGA 60
CTGAAGAAAA AGGAGCTGGA GCAGGAGAAG GAGGTGCTGC TGCAGGGTTT GGAGATGATG 120
GCGCGGGGCC GOGACTGGTA CCAGCAGCAG CTGCAACGAG TGCAGGAGCG CCAGCGCCGC 180
CTGGGCCAGA GCAGAGCCAG CGCCGACTTT GGGGCTGCAG GGAGCCCCCG CCCACTGGGG 240
CGGCTACTGC CCAAGGTACA AGAGGTGGCC CGGTGCCTGG GGGAGCTGCT GGCTGCAGCC 300
TGTGCCAGCC GGGCCCTGCC CCCGTCCTCC TCCGGGCCCC CCTGCCCTGC CCTGACGTCC 360
ACCTCACCCC CGGTCTGGCA GCAGCAGACC ATCCTCATGC TGAAGGAGCA GAACCGACTC 420
CTCACCCAGG AGGTGACCGA GAAGAGTGAG CGCATCACGC AGCTGGAGCA GGAGAAGTCG 480
GCGCTCATTA AGCAGCTGTT TGAGGCCCGC GCCCTGAGCC AGCAGGACGG GGGACCTCTG 540
GATTCCACCT TCATCTAGTC CTTGTGGGCC GCGTGGGCCC CCAGGGCCAG CCTGGCACTC 600
AGCCCTTCGA GGGTGGGCGC CCCATCGCAC CCACCCTCTC TGGCTGGAGA CCCCCGGCAG 660
GCCCAGGCAC AGTCCCGGAG TGGGCGCCTT CCTGCCGCCC TTGCCAGATG GGCTCCCCAG 720
GCCTGCCCCC GGCTGGTCCC CGCACCGAGC GCTTGACTCC GTTTKGGCTC CTGGTTGYTG 780
ACATGGGCTG GGGGCTCTCT TGAGTCCGCA TAGTCCGCAG CTACTACTGG CCGCTGTCAG 840
TGGACAGTGG GGTACCCCTC C ATG AGTTAG CGTCCCCCCG TTTCCAGCGG TGCCGCCCTG 900
GGTCCCATCT TCAGGGAAAG GCACTGCCCA CGCCAGGCTG CACTTCCAAC AACGGGCAGC 960
AGAGGGCGCG GGGCGGCTCC GACGOGGGTC CAAGGGCAGC TTCCCGCTCA ACCAGGGCAC 1020
CAGGACGAGG TGGCTGTAGC TCGGACGGAC GGAAGTAGAT GGAGGGGGTG GGGACGGCCT 1080
G7AAGCGGGG GGTGCCTGCC TGGCTGGGGA GCCCCAGGGA TAGCGGTCGG ACTTCAGGTT 1140
CTGGCCAAGG CTGAGGGACC CTGGCTGCAG CGGATCGGCA CGCCGGGTGG GCGAGAGCTT 1200
GGCCTGCATG TGCCTCCCAC AGACCCTGGG GTGATGGCCT TCCCCCTCTT GGCCGGGACG 1260
TTGCCCCAOG TTGAGTCCCA CACAACATCC TGTGAGCCTG GCTCCCCAGG AGGGCCCCCA 1320
GACAGCTCCC AGGCACGTCA TAGGCAAAGC CTGTTTCCCC CGACTCAGGA TTTCCAAGGC 1380
CTGGGGTCCT GCTCACCCCC CTTTGCTCTC ACGCCCAGCC TGTCCCCAGG TTTCAGCTGG 1440
GAGAGGCCAC CTCCCTCAGC CAAOGAAAAC GAGAACCCCC AGGGTACAGG AGGAGGCTGG 1500
GGCAGGTCCC CTTGGGTGTC ACTCCCTCAG CCCCTGCCCA GGCCCACTCC CGCTGGTGCT 1560
GGAGTACGCA CTGGTGGGGG GGCCCTGCTC AGCCCAACCT GGAGGGTCCC AGTGTCACCA 1620
GAACCAGGGG CACGGCAACA GCATCGATGG GTTCTGCAGC CCAGGGCCCC CGATGCGGGG 1680
TCAGTGTGTG TGGGGCGCAG GGCCTCCGAT GCGGGGTCAG TGCGTGGGGG GCGCAGGGCC 1740
CCCGATGCGG GGTCAGTGCG TGGGGGGCGC AGGGCCCCCT CGTGTCCAGG GCACTTTGGT 1800
ACACTGTCCC ACAAGGCACC TGTCTCAGAG GAGGGGCCCT GGCAGGCAGC GTGGCAACTC 1860
CCTTCCGGAG CCCAGCTCCA TGCTAACCTG CCCACAGCAA CCCCACAGAG CCACATTCCC 1920
TGCTGCACCT GGTCTGCAGG GGTGTOCCAG GACAGGCCCA AGTCAGCCCA GCATGCAGCT 1980
GCCCTCCTAC CCTGAAGATG GGAGTGGGCT TTCCAGGGGA CATAAGGATG TCAGGCCTGG 2040
ACCTCCTGGG CAGGAAAGGG TGCAGGTCCT GAGGGCCTGT GCCCCACAGC CCCAGCACCC 2100
AGGTGGACTG CAGCGCAGTG GGTGGGCCAG TGGCAGCCAG GGAGAAGCCC CCCGTCAGCA 2160
GGCTGGGGTC TGCCCACCAG GGCCTCCCCA CGTCTGCCTT TGAGGGTGCC TGCCATGCCC 2220
TGGGGGATCC TGGCATCTTT ACTGGACTGG AAGCAGGAGA CAGAACAGTG TCTGTCCCGG 2280
GGTGACTTCA TCAGGAGAOC GCCCACATAG AGCTGGACCC CGCAGCTGAA GCGGAAATGT 2340
GAGACAGGCT GGCACCTCCG GAAAAACTGC CTTTCAGCCT TGGTGTTCCG TGCAAGGTGA 2400
AAAGAAATAG GTCCTCCCAG TTTACAGCTT GAAATCAGGC TAGTGAGTGG CCCTGGAGAC 2460
CACGAGGGGA GAATTTAAAG GCCCCGGCTG GCAGGGTCTA GGTGGCTGGC AGAGGCACAT 2520
GCAGACCCTG CCTGGAGCCT GCCCTAGGAC GCTGGGCGGG TCAGTCTCCG TGCAGGATGT 2580
GAGCAGCGTC CCTGGGCTCT ATCCGCGAGG TGCCAGTAGC GTGTGCAGGT ACATACACGT 2640
GCGTGCACAC TGTGATGACA CCCGGAAATG TCTCAGGATG TTGAAATGTG TCCTTGGGGG 2700
CAGAAGTGTC CCCAGTTGAG AATCTGCCCC AGAGGAACAC ACCCACACCA GGCCTCAGGA 2760
TTTTGTGTTG ATCAAGTTCC AAGGAAAAGG AACATCTCAG CCGGGCGTGG TGGTTCACGC 2820
CTGGAATCCC AGCACTTGAG GCCAGGAGTT CCAGAGCAGC CTGGGCAACG CAGTGAGAGA 2880
CCCCATCTCT ACAARAAAAA AAAAAGAAAG AAAGAAAATG AGAGATCCAG GTTTAAAAAT 2940
TCATAAACAC CACAAGGAAA CAATACACTA TGAGACCCAG CAGAAGCAAC AGATTGACTC 3000
TAGACCCAGA TACTAGAATT ATCAGAGAGA ATATAAAGTA ACAGTGTTTT ATATATCTAA 3060
AGAAATAAAA GAGATTTCTG GAAACATGAA
SEQ ID NO:197 LBG2 DNA SEQUENCE

Nucleic Acid Accession #: X63529

Coding sequence: 64-2543 (start and stop codons are underlined)

1 11 21 31 41 51
I I I I I I

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA OCCATGGGGC 60 
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTG CAGTGCGCGG 120 
CCTCCGAGCC GTOCCGGGCO GTCTTCAGGO AGGCTGAAGT GACCTTGGAG GCGGGAGGCG 180 
CGGAGCAGGA GCCCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAG AGC 240 
CAGCTCTGTT TAGCACTGAT A ATG ATGACT TCACTGTGCG GAATGGCG AG ACAGTCCAGO 300 
AAAGAAGGTC ACTG AAGGAA AGGAATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360 
GAAGAC ACAA G AG AG ATTGG GTGGTTGCTC CAATATCTGT CCCTG AAAAT GGCAAGGGTC 420 
CCTTCCCCC A G AGACTG AAT CAGCTCAAGT CTAATAAAG A TAGAGACACC AAGATTTTCT 480 
ACAGCATCAC GGGGCCGGGG GCAGACAGCC CCCCTGAGGG TGTCTTCGCT GTAGAGAAGG 540 
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGACCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CGCTGTGTCA GAG AATGGTG CCTCAGTGGA GGACCCCATG AACATCTCCA 660 
TCATCGTGAC CG ACCAGAAT GACCACAAGC CCAAGTTTAC CCAGGACACC TTCCGAGGGA 720 
GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTGACAGCC ACAGATGAGG 780 
ATG ATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840 
AGGACCCACA CGACCTCATG TTCACAATTC ACCGGAGCAC AGGCACCATC AGCGTCATCT 900 
CCAGTGGCCT GGACCGGGAA AAAGTCCCTG AGTACACACT GACCATCCAG GCCACAGACA 960 
TGGATGGGGA CGGCTCCACC ACCACGGCAG TGGCAGTAGT GGAGATCCTT GATGCCAATG 1020 
ACAATGCTCC CATGTTTGAC CCCCAGAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080 
GCCATGAGGT GCAGAGGCTG ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140
GTGCCACCTA CCTTATCATG GGCGGTGACG ACGGGGACCA TTTTACCATC ACCACCCACC 1200 
CTGAGAGCAA CCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260 
AGCACACCCT GTACGTTGAA GTGACCAACG AGGCCCCTTT TGTGCTGAAG CTCCCAACCT 1320 
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380 
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440 
CTGCAGAAGA CCCTGACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACCCAG 1500 
CAGGGTGGCT AGCCATGOAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTOGACC 1560 
GTGAGG ATGA GCAGTTTGTG AGG AACAACA TCTATGAAGT CATGGTCTTG GCCATGG ACA 1620 
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680 
AOCATGGCCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGCGCC 1740 
ACGTGCTGAA CATCACGGAC AAGGACCTGT CTCCCCACAC CTCCCCTITC CAGGCCCAGC 1800 
TCACAGATGA CTCAGACATC TACTGGACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860 
TCTTGTCCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTG 1920 
ACCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGCGAC TGCCATGGCC 1980 
ATGTCGAAAC CTGCCCTGGA CCCIGG AAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTGTTCCTC CTGCTGGTGC TGCTTTTGTT GGTGAGAAAG AAGCGGAAGA 2100 
TCAAGGAGCC CCTCCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC CGAGGTCTGG 2220 
AGGOCAGGCC GG AGGTGGTT CTCCGCAATG ACGTGGCACC AACCATCATC CCG ACACOCA 2280 
TGTACCGTCC TAGGOCAGCC AACCCAG ATG AAATCGGCAA CTTTATAATT GAG AACCTGA 2340 
AGGCGGCTAA CACAGACCCC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTCGACTATG 2400 
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460 
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCGCTT CAAG AAGCTG GCAGACATGT 2520 
ACGGTGGCGG GGAGGACGAC XA2GCGGCCT GCCTGCAGGG CTGGGGACCA AACGTCAGGC 2580 
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT G AGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGG AG ACA GGCTATG AGT CTG ACGTTAG AGTGGTTGCT 2700 
TCCTTAGCCT TTCAGGATGG AGG AATGTGG GCAGTTTGAC TTCAGCACTG AAAACCTCTC 2760 
CACCTGGGCC AGGGTTGCCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGOCGTAAAA 2820 
TGCTCAACCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACTTTCTCT 2880 
CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT 1 1 1 1 1 U 1 11 AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAOA GCTOCTGGGC CCA CTGGCC G 3000 
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAOAT 3120 
GAAGGGTGAG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A 



SEQ ID NO:198 LBG2 Protein sequence:

Protein Accession #: CM45177

1 11 21 31 41 51

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPIS VPENG 120 
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWIXLN KPLDREEIAK 180 
YELFGHAVSE NG AS VEDPMN 1SIIVTDQND HKPKFTQDTF RGSVLEGVLP GTS VMQVTAT 240 
DEDDAIYTYN GWAYSIHSQ EPKDPHDLMF TtHRSTGTIS VISSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAWEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360 
AWRATYUMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 
PTSTATTV VH VEDVNEAPVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 
DPAOWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTUD 540 
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPFQ AQLTDDSDIY WTABVNEEGD 600 
TWLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660 
G AVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDN VFY YGEEGGGEED QDYDITQLHR 720 
GLEARPEWL RNDVAPTQP TPMYRPRPAN PDEIGNFHE NLKAANTDPT APPYDT1XVF 780
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD 

SEQ ID NO:199 OBIS DNA SEQUENCE 

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGQ CTAATTTAGC TOCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTOCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480

CTGCTCATTT TGCTTGTCTG GGOCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC COCGGGCCTG GTGGTTCTGC TCCTOGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OBIS Protein sequence:

Protein Accession #: NP_038284

1 11 21 31 41 51

I I I I I I

MNECHYDKEM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VXAAVTKNRK 60 
FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL 120 
LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA 180 
PIYSRSYLVF WTVSNLMAFL IMVWYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 
VMTVLGAFW CWTPGLWLL LDGLNCRQCG VQHVKRWFLL LALLNSWNP IIYSYKDEDM 300 
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS 

SEQ ID NO:201 PAA6 DNA SEQUENCE 

Nucleic Acid Accession #: AA569531

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGACCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTTCTT 120 

GATGAAACCT CTGGACTAAG CACACATCTT CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATOA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

ATACCCAGAG GGAACAAACG CTCCCCAAAA AGAGTTACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAG GCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTACGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600 

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATC 660 

CCAGCTACTC fcTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720 

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence:
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

KTYSYSFFRP ELIVNHLNYV HSEANRRTKT KTLLSLLSFL DBTSGLSTHL PCLSLSKECG 60 
VLHLDIHGKK EDMRITQQSS QLYLWDMGGP TIFKNLWMSL IPRGNKRSPK RVTETILRDF 120 
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG 
SEQ ID NO:203 PAB2 DNA SEQUENCE

Nucleic Acid Accession #: XM_050197

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

TCACACGTGC CAAGGGGCTG GCTCAGCGGA ACCAGCCTGC ACGCGCTGGC TCCGGGTGAC 60 

AGCCGCGCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 

GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 

GGCGCCTGGC TGATTCCTAG GCAGTTGGCG GCAGCAAGGA GGAGAGGCCG CAGCTTCTGG 240 

AGCAGAGCCG AGACGAAGCA GTTCTGGAGT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 

TGGCCCAC TA TG GTCCAGAG GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360 

CTCTTGCTGG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGCCGC AGGCATCACC 420 

TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 

GGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540 

TGGCGTGGAC GCTATGGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 

CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTGCTGTG CCCGGATCCC 660 

AGGCCCCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 

GTGTGCTTCA CTCCACTGGA GGCCCTGCTC TCTGACCTCT TCCGGGAOCC GGACCACTGT 780 

CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840 

CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 

TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTG 960 

GCTGAGGAGG CAGCGCTGGG CCCCACCGAO CCAGCAGAAG GGCTGTCGGC CCCCTOCTTG 1020 

TCGCCCCACT GCTGTCCATG CCGGGCCCGC TTGGCTTTCC GGAACCTGGG CGCCCTGCTT 1080 

CCCCGGCTGC ACCAGCTGTG CTGCCGCATG CCOCGCACCC TGCGCCGGCT CTTCGTGGCT 1140 

GAGCTGTGCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 

GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT GAGCCGGGCA CCGAGGCCCG GAGACACTAT 1260 

GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 

TTCTCTCTGG TCATGGACCG GCTGGTGCAG CGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380 

AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TOTCCCACAG TGTGGCCGTG 1440

GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 

ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 

ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 

GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCACCTCCA 1680

CCCGCGCTCT GCGGGGCCTC TGCCTGTGAT GTCTCCGTAC GTGTGGTGGT GGGTGAGCCC 1740

ACCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 

GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT CCAGCTCAGC 1860 

CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 

GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGCGTA GAAAACTTCC 1980

AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040

ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 

GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 

CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC AAGGGGGTTT CAGTCTGGAC 2220 

TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280

ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTTGAGAC ACACCTAGAG AAGGGTTTTT 2340

GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTAA CCTGCAGCTT 2400 

CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 

ACATATGAAA GTTATTTGTA GGGGAAGAGT CCTGAGGGGC AACACACAAG AACCAGGTCC 2520 

CCTCAGCCCC ACAGGCACTG GTCTTTTTTG CTNGANTCCA CCCCCOCCCT CTTTACCCTT 2580 
TT 






SEQ ID NO:204 PAB2 Protein sequence:
Protein Accession #: XP_050197



1 11 21 31 41 51 

I I I I I I 

MVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVBE KFMTMVLGIG 60 
PVLGLVCVPL LGSASDHWRG RYGRRRFFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120

BLALLILGVG LLDFCGQVCF TPLEALLSDL PRDPDHCRQA YSVYAFMISL GGCLGYLLFA 180 
IDWDTSALAP YLGTQEECLF GLLTLIFLTC VAATLLVAEE AALGPTEPAE GLSAPSLSPH 240 
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300 
YOGVPRAEPG TEARRHYDEG VRMGSLGLFL QCAISLVFSL VMDRLVQRFG TRAVYLASVA 360 
AFFVAAGATC LSHSVAWTA SAALTGFTFS ALQILPYTLA SLYHREKQVF LPKYRGDTGG 420

ASSEDSLHTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVWGEPTEA 480 
RWPGRGICL DLAILDSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540 
WFDKSDLAK YSA 

SEQ ID NO:205 PAJ3 DNA SEQUENCE

Nucleic Acid Accession #: AK002126

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGTTCGCC GGGGGCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60 

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 

CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180

GAGTGGGAGG AGCAGCAGCG CAACTACGTG AGCAGCCTGA AGCGGCAOAT CGCACAGCTC 240



AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300 

GCTGCTGGCC TGGGTCTGOA CAGGAGCCCC CCAQAGAAAA CCCAGGCCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480 

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540 

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600 

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780 

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840 

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TTTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGCGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380 

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCACCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 



SEQ ID NO:206 PAJ3 Protein sequence:
Protein Accession #: NP_060841



1 11 21 31 41 51

| | | | | |

MVRRGLLAWI SRVWLLVLL CCAISVLYML ACTPKGJDEEQ LALPRANSPT GKEGYQAVLQ 60 

EWEEQHRNYV SSLKRQIAQL KEELQBRSEQ LRNGQYQASD AAGLGLDRSP PBKTQADLLA 120 

FLHSQVDKAE VNAGVKLATB YAAVPFDSFT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180 

AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTFKGDH KHEFKRLILF 240 

RPFGPIMKVK NEKLNMANTL INVIVPLAKR VDKFRQFMQN FREMCIEQDG RVHLTWYFG 300 

KEEINEVKGI LENTSKAANF RNFTFIQLNG EFSRGKGLOV GARFWKGSNV LLFFCDVDIY 360 

FTSEFLNTCR LNTQPGKKVF YPVLFSQYNP GIIYGHHDAV PPLEQQLVIK KETGFWRJDFG 420 

FGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IWRTFVRGL FHLWHEKRCM 480 
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHED3AHLRK QKQKTSSKKT 



SEQ ID NO:207 PAJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF189723

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT 7GTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCCCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540 

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660 

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGOCT TAATTGCTCT TGCAATGAAG 1260

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CXZAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560 

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920

ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980

CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 2040 

AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGCAGC ATTAACTTTA 2100 

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 2220

GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA 2280 

CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400 

TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 2460

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGACAATTA 2520

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580 

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 2640 

AAGGTTGAAA GGAGCAGGGA AAAGATCCAO AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA 

SEQ ID NO:208 PAJ5 Protein sequence:

Protein Accession #: AAF27813



1 11 21 31 41 51

| | | | | |

KIPVLTSKKA SELPVSBVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKKYI 60 

SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIWTV AFVQEYRSEK SLEELSKLVP 120 

PECHCVREGK LEHTLARDLV PGDTVCLSVG DRVPADLRLF EAVDLSIDES SLTGETTPCS 180

KVTAPQPAAT KGDLASRSNI AFMGTLVRCG KAKGWIGTG ENSEFGEVFK MMQAEEAPKT 240

PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEMFT ISVSLAVAAI PEGLPIWTV 300

TLALGVMRMV KKRATVKKLP IVETLGCCNV ICSDKTGTLT KNEMTVTHIF TSDGLHABVT 360 

GVGYNQFGEV XVDGDWKGF YNPAVSRIVE AGCVCNDAVI RNNTLMGKPT EGALIALAMK 420 

MGIiDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 480 

GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TPLGLVGIID PPRTGVKEAV 540

TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSVSG EBIDAHDVQQ LSQIVPKVAV 600

FYRASPRHKM KIIKSLQKNG SWAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAADMI 660 

LVEDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 720 

INIIMDGPPA QSLGVEPVDK DVIRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780 

ELRDNVITPR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIKGQL 840

LVTYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 900
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE 

Nucleic Acid Accession #: N62096

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600 

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200 

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260
ATTAGTATCT TTCAACTCGA GTAA 

SEQ ID NO:210 PAV4 Variant 1 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60 

LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120 

GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVPAK 180 

PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSZVISVFIC IFFATCGYLT 240 

FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300 

IWTVMVITV ATLVSLLIDC LOIVLELNGV LCATPLIPII PSACYLKLSE EPRTHSDKIM 360

SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EHFYCPPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE

SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAQT TTTCCCTTGT TTTATTQATA 60 

AAAGGAGGGG CCCTCTCTGQ AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGOC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300 

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480 

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600 

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAQT TCTAGAACTC 900 

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200 
TAA 



SEQ ID NO:212 PAV4 Variant 2 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I

HGYQRQEFVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60 

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120 

TGLTTLILGI VHARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180 

EEPTVAKWSR LIHHSIV1SV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240 

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIWTVMV ITVATLVSLL IDCLGIVLEL 300 

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAWMVFGFV MAITNTQDCT 360 
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISDFQLE 

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE 

Nucleic Acid Accession #: N62096

Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60 

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGG8 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 

SEQ ID NO:214 PAV4 Variant 3 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60

PENVPIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLXLGIV MARAISLGPH 120 

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180 

ICIFFATCGY LTPTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240 

VFFGGNLSSV FHIWTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300 

SEEPRTHSDK IMSCVMLPIG AWHVFGFVM AITNTQDCTH GQEMFYCFPD NFELTNTSES 360 
HVQQTTQLST LNISIFQLB 

SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE: 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60

ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120 

GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 

GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240 

GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300 

AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420 

ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 

ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 

TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 

TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 

ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 

TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 

GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 

TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 

AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence:
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVT PPQRDLDDRE TLVSEHEYKE KTCQSAALFN WNSIIGSGI IGLPYSMKQA 60 

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120 

IAMISYNTIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180 

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVHSFA FICHHNSFLV 240 

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIW TVMVITVATL VSLLIDCLGI 360 

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAWMV FGFVMAITOT 420 
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ 

SEQ ID NO:217 PAV9 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTGTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 

GGGTCTGAGQ AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTCG 1020

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080

GCCCAGAGTG AACTCTTTCG GGGGOACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140

CTCATGGAC6 CCCTGCTGAA TGACCGGCCT GAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCCC 1260

TCCAACTCGC TCATCCGCAA CCTTTTGGAC CAGGCGTCCC ACAGCGCAGG CACCAAAGCC 1320

CCAGCCCTAA AAGGGGGAGC TGCGGA6CTC CGGCCCCCTG ACGTGGGGCA TGTGCTGAGG 1380

ATGCTGCTGG GGAAGATGTG CGCGCCGAGO TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440

CCAGGCCAGG GCTTCGGGGA GAGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500

TCGCTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620

GCTCTTGGGG CCTGTTTGCT GCTCCGGOTO ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680

GCAGCACGGA GGAAAGACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740

GAGTGCTATC GCAGCAGTGA GGTGAGGGCT GCCCGCCTCC TCCTCCGTCG CTGCCCGCTC 1800

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAGCTG ACGCCCGTGC CTTCTTTGCC 1860

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920

CCCATCTGGG CCCTGGTTCT CGCCTTCTTT TGCCCTCCAC TCATCTACAC CCGCCTCATC 1980

ACCTTCAGGA AATCAGAAGA GGAGCCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040

GTCATTAATG GGGAAGGGCC TGTCGGGACG GCGGACCCAG CCGAGAAGAC GCCGCTGGGO 2100

GTCCOGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGCGGGGG GCGCCGGTGC 2160

CTACGCCGCT GGTTCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220

AGCTACCTGC TGTTCCTGCT GCTTTTCTCG CGGGTGCTGC TCGTGGATTT CCAGCCGGCG 2280

CCGCCCGGCT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340

CTGCGCCAGG GCCTGAGCGG AGGCGGGGGC AGCCTCGCCA GCGGGGGCCC CGGGCCTGGC 2400

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460

CTAGTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC GGCTGACCCC GGGTTTGTAC 2520

CACCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAAG 2640

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGCCTATGG CGTGGCCACG 2700

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTAC 2760

CGTCCCTACC TGCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820

GAGCACAGCA ACTGCTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCOT CATCTTCCTG 2940

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCATTGCCA TGTTCAGTTA CACATTCGGC 3000

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180

TTCCGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCGACTC CGAGCGTCTG 3300

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420

GTGGCCGAGG CCCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480
CTGCCTGGGT CCAAAGACTG A



SEQ ID NO:218 PAV9 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

HEDAFGAAW TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60 

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL HTGIGRHVGV 120 

AVRDHQMAST GGTKWAHGV APWGWFNRD TLINPKGSFP ARYRWRGDPE DGVQFPLDYN 180 

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

ENATQAQLPC LLVAGSGGAA DCLAETLEDT IAPGSGGARQ GEARDRIRRF FPKGDLEVLQ 300 

AQVERIKTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360 

AQSELFPvGDI QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHFLTFH RLAQLYSAAP 420 

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480 

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYP WEMGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAFK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600 

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFF CPPLIYTRLI 660 

TFRKSEEEPT REELEFDMDS VINGEGPVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHPWGA PVTIFMGNW SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780 

LRQGLSGGGG SLASGGPGPG HASLSQRLRL YLADSWNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900 

EGLLRPRDSD FPSILRKVPY RPYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960 

GTCVSQYANW LWLLLVIPL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020 

FHSRPALAPP PIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080 

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQCSRVLGW 1140 
VAEALSRSAL LPPGGPPPPD LPGSKD 

SEQ ID NO:219 PBF1 DNA SEQUENCE

Nucleic Acid Accession #: AA054237

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGACCCCOG GCGCCACAAG 120 

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 
CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CCCCCGCTGG GGCGCOOGCT GCTCCCGGGC 240

GGCCCGGGGC GCGCCQACCC CGAGTCCTGG CGCTCGCTCC TGGGGCTCGG CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTOGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACQ 420 

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480

AAGACCATAC AGCAAQATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CCCTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA CCGGCTOCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840 
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA 

SEQ ID NO:220 PBF1 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

MEPRALVTAL SLGLSLCSU3 LLVTAIFTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60

PLSHLFLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120 

LGIDRDIDTL ILKGIAQRCT AIKYHFSQPX RLRNIPFNLT KTIQQDEWHL LHLRRITAGF 180

LGMAVAVLLC GCIVATVSFP WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240 
KLIYSLPADV EHGYSWSIPC AWCSLGPIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 






SEQ ID NO:221 PCI4 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:222 PCI4 Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

I I I I I I

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 

KEWQRMLQLI QSRLQEEHSL QDVTFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 

IDHNQMFQYF ITWPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIXCCRFRLG SVKPVNSVPF 360
EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_001935.1

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300

ATCTTCGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGACGG GGAAAGAAGA TATAATATAT 660 

AATGGAATAA CTGACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720 

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATTGAATACT CCTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTG CTCCTGCTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080 

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTGGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCOGGTCCT 1500

GGTCTGCCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680 

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGGAAGTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAG ACCATGGAAT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700

TTTCTAACTG GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820 

TGGCTGGGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCTCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC 

SEQ ID NO:224 PEZ3 Protein sequence:

Protein Accession #: NPJW1928.1

1 11 21 31 41 51

| | | | | |

MKTPWKILLG LLGAAALVTI ITVPVVLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60

RWISDHEYLY KQENNILVFN AEYGNSSVFL ENSTFDEFGH SINDYSISPD GQFILLEYNY 120

VKQWRHSYTA SYDIYDLNKR QLITEERIPN NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180 

PSYRITWTGK EDIIYNGITD WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240

YSDESLQYPK TVRVPYPKAG AVNPTVKFFV VNTDSLSSVT NATSIQITAP ASMLIGDHYL 300 

CDVTWATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360

EPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG TWEVIGIEAL TSDYLYYISN 420 

EYKGMPGGRN LYKIQLIDYT KVTCLSCELN PERCQYYSVS FSKEAKYYQL RCSGPGLPLY 480 

TLHSSVNDKG LRVLEDNSAL DKMLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY 540

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT 600

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660

YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAMWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP 

SEQ ID NO:225 PBJ2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |
ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC 120

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGTCTA A



SEQ ID NO:226 PBJ2 Protein sequence:

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | |

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLLMGVLEAC 60
VEMRPLSVWS LRDDKEQSPH QPTLDV
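
The relationship between a listed "Coding sequence" range and the protein sequence that follows it can be checked by translating the range with the standard genetic code. The sketch below does this for SEQ ID NO:225 (coding sequence 1-261) so the result can be compared with SEQ ID NO:226. The function name `translate_cds` and the variable `PBJ2_DNA` are illustrative only. The DNA string is a transcription of the listing in which positions 4 and 62 are taken to be G, which is an assumption about the scanned text.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(dna, start, end):
    """Translate the 1-based inclusive range start..end, stopping at a stop codon."""
    # Remove the spaces and line breaks used for layout in the listing.
    cds = "".join(dna.split()).upper()[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Transcription of SEQ ID NO:225 (positions 4 and 62 assumed to be G).
PBJ2_DNA = """
ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC
AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC
AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT
GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC
CAGCCCACAC TGGATGTCTA A
"""

print(translate_cds(PBJ2_DNA, 1, 261))
```

The same function could be applied to the other entries by passing their listed coding ranges, for example a range that starts after a 5' untranslated region.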

SEQ ID NO:227 PBM2 DNA SEQUENCE

Nucleic Acid Accession #: none found

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

| | | | | |

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 

TTTATGGCTA TTGAAGAAGA AATGAAGAAG CACGGAAGTA CTCATGTGGG ATTCCCAGAA 300 

AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTTCT GA

SEQ ID NO:228 PBM2 Protein sequence:

Protein Accession #: none found 



1 11 21 31 41 51 

| | | | | |

MPNAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60
IMWTSFVEDN LSMGWGKLED FMAIEEEMKK HGSTHVGFPE NLTNGAAAGN GDDGLIPPRK 120
SRTPESQQFP DTENEEYHRF VKDQIVVDMR RYF

SEQ ID NO:229 PEZ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014253

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

| | | | | |

GACTGCTTGC ATTAAAGGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120 

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGACCC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360 

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420 

TGCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGCCAACTCT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTG ATGGGGAAAA 540

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660 

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720 

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780 

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAACAGC AACATACCAT TGGAGACCAG 840 

GCATTCCCTO TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960 

CTTTTCCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020 

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG GACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320 

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440 

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560 

TATGGATCAA GGACCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620 

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680 

TGGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTCCTTGGAC CTGACTGTGC 1740 

TAGAGATTCC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920 
AGGAGAAATA TGCGAGGAAG AGGACTGCCT AGACCCAATG TGTTCCAACC ATGGCATCTG 1980

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATG 2100

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTGTA CCATGGAGTG 2160

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAACGCTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGGGAGGG CGACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTCT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGGC CATCGGTGGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAGAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TGGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATG GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCCCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TGTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTCTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

CGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCCGAGGG 5040 

ACACCTGACC AATGCAACGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGACCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCCCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCGAGTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCATAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATGATATTT TTGAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATCGG CTTGGGCGAC GTGTCGCGAG 6840
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC GCGACCGCGA ACCCCATAAG 6900
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960
AGGTCACCTT ATTGCCATGG AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCGAGGTCAG GTCATAAAGG AGATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGGGGCAAA GGGATTATGA 7200 
TGTTGTTGCT GGCAGATGGA CAACGGCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260

TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTTGGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGGCTT CAGACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500 
GTGTGAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTG GACCAACTAC CTATGACTCC 7560

CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620
TTCTGTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGCC ATTCTCAATA ATGCCCATTA 7740 
CCTGGAAAAC CTACATTTTA CCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTGGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGACGGTT 7920
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980 
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100 
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220
GAGCGAAATA GGCAGGAGGT AACAAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340 
AAATATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400 
ATTGTTTGTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460

CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520
ATTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCCCTGTGGG 8880 
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTGTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TGATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360

TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC CGTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720
TGACAAAGAG ATAGTTTGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATACCACTT ACACATGTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620
AATACGTATT TGGTTGGTTC GTGCCTTTAG TTTGTTAAAG TTACATTTOT ATTATATTCA 10680 
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGTTTGT 11160

GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TOTCCTTTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460

ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520
TCACATGCTA CCTATGTAGA CAQQTATGAA ATTAAOTTAT AATTTTCATQ AGACATTTTC 11580
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATGTTAT TTTGACTCTT 'ITXTl T m 1 11640 
TTTTCTTTAA AAATATATTT TTAACTAGAC CAGGCCCCAC TATAATATCA CTTAAGAQAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820
TTGGAGCATA TTATATATAG CTTGTGGAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
CCTGTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTG CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTGGC 12240 
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360
AAAAATATTT TACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTTTTTAGCT GCTTACTTTC TCATGAAAAO TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660 
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720
ATTTTGCTCT GGCTTTCAGO CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 






SEQ ID NO:230 PEZ2 Protein sequence:
Protein Accession #: NP_055068



1 11 21 31 41 51

| | | | | |

MEQTDCKPYQ PLPKVKHEMD LAYTSSSDES EDGRKPRQSY NSRETLHEYN QELRMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120

LRMWIRGMKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSPVCCDM EAQAGSTQDV 180

QSSPHNQFTF RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPRPLPRSTF 300

SRPAFTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV FQKGRAIDTG EVDIGAQVMQ TIPPGLFWRF 420 

QITIHHPIYL KFNISLAKDS LLGIYGRRNI PPTHTQFDFV KLMDGKQLVK QDSKGSDDTQ 480

HSPRNLILTS LQETGFIEYM DQGPWYLAFY NDGKKMEQVF VLTTAIEIMD DCSTNCNGNG 540

ECISGHCHCF PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPECD VPEEQCIDPT 600 

CFGHGTCIMG VCICVPGYKG EICEEEDCLD PMCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TFLLDAGVCS CDPKWTGSDC STELCTMECG SHGVCSRGIC QCEEGWVGPT 720

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNVVME MLCGDNLDND GDGLTDCVDP DCCQQSNCYI SPLCQGSPDP 840

LDLIQQSQTL FSQHTSRLFY DRIKFLIGKD STHVIPPEVS FDSRRACVIR GQVVAIDGTP 900

LVGVNVSFLH HSDYGFTISR QDGSFDLVAI GGISVILIFD RSPFLPEKRT LWLPWNQFIV 960

VEKVTMQRVV SDPPSCDISN FISPNPIVLP SPLTSFGGSC PERGTIVPEL QVVQEEIPIP 1020

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMIKVHLTV AVEGRLTQKW FPAAINLVYT 1080

FAWNKTDIYG QKVWGLAEAL VSVGYEYETC PDFILWEQRT VVLQGFEMDA SNLGDWSLNK 1140

HHILNPQSGI IHKGNGENMF ISQQPPVIST IMGNGHQRSV ACTNCNGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHKYY LAMDPVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEVVA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320

SVDGTMIRKI DENAVITTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPMDNSLY 1380

VLDNNIVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IHSTLESARA ISVSHSGLLF 1440

IAETDERKVN RIQQVTTNGE IYIIAGAPTD CDCKIDPNCD CFSGDGGYAK DAKMKAPSSL 1500 

AVSPDGTLYV ADLGNVRIRT ISRNQAHLND MNIYEIASPA DQELYQFTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGMPLWL VVPGGQVYWL TISSNGVLKR 1620

VSAQGYNPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSFHSDLEK 1680 

LTKVELDTSN RENVLMSTNL TATSTIYILK QENTQSTYRV NPDGSLRVTF ASGMEIGLSS 1740

EPHILAGAVN PTLGKCNISL PGEHNANLIE WRQRKEQNKG NVSAFERRLR AHNRNLLSID 1800

FDHITRTGKI YDDHRKFTLR ILYDQTGRPI LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860

EKMEYDQSGK IISRTWADGK IWSYTYLEKS VMLLLHSQRR YIFEYDQSDC LLSVTMPSMV 1920

PHSLQTMLSV GYYRNIYTPP DSSTSFIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980

VLYDTTQVTL TYEESSGVIK TIHLMHDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040

YSYNNFRVTS MQAVINETPL PIDLYRYVDV SGRTEQFGKF SVINYDLNQV ITTTVMKHTK 2100

IFSANGQVIE VQYEILKAIA YWMTIQYDNV GRHGNMCIRV GVDANITRYF YEYDADGQLQ 2160

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KMDEDGFLRQ 2220 

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DRLGRRVASK SSLGQHLQFF VDATANPIRV 2280

THLYNHTSSE ITSLYYDLQG HLIAMELSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGQRDYDV VAGRWTTAYH HIWKQLNLLP 2400 

KPFNLYSFEN NYPVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPEL ENLELTYELL 2460 

RLQTKTQEWD PGKTILGIQC ELQKQLRNFI SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520

VFGKGIKFAI KDGIVTADII GVANEDSRRL AAILNNAHYL ENLHFTIEGR DTHYFIKLGS 2580 

LEEDLVLIGN TGGRRILENG VNVTVSQMTS LLNGRTRRFA DIQLQHGALC FNIRYGTTVE 2640

EEKNHVLEIA RQRAVAQAWT KEQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700
VEQYLELSDS ANNIHFMRQS EIGRR

SEQ ID NO:231 PFD4 DNA SEQUENCE:

Nucleic Acid Accession #: NM_000441
Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAACGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCGCGG GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCGGA GCCGCCGCAG CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300

TCTACAGCGA GCTCGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TGCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900

CTGCTGCCTT CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GdTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320 

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440 

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGGCCTGGG GAAGCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TGGGTTTACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGATAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAQ TGGOTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATG TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAOACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAOT GAGCCGAGGT 4440 

CGTGCCACTG CACTCCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 
AAAAAAAAAA AGAGTGAATG TAATAQTCTT GCAGAAAATG AATGAATACC TTTGTTCAAT 4560

AAAGGAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGCC AAAGTTACGT TTTACAACAA 4620 

GGCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAQCAA ACTGCGGGAA 4680 

TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAGTTTTG TATTATCAAT 4740 

GAAAATTTCA CTTGAAATTA AAGCTGCCTT TTGTTATATT TTTAACCTAT AGGATAAGAT 4800 

TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860 

TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TQTGTCCTTT 4920 
CTGAACAAAA 



SEQ ID NO:232 PFD4 Protein sequence:

Protein Accession #: O43511




1 11 21 31 41 51 

| | | | | |

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQQQHERRL QERKTLRESL AKCCSCSRKR 60

AFGVLKTLVP ILEWLPKYRV KEWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120
FFPILTYFIF GTSRHISVGP FPVVSLMVGS VVLSMAPDEH FLVSSSNGTV LNTTMIDTAA 180

RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240

NVSTKNYNGV LSIIYTLVEI FQNIGDTNLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300 

PIEVIVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAVVA 360

YAIAVSVGKV YATKYDYTID GNQEFIAFGI SNIFSGFFSC FVATTALSRT AVQESTGGKT 420

QVAGIISAAI VMIAILALGK LLEPLQKSVL AAVVIANLKG MFMQLCDIPR LWRQNKIDAV 480

IWVFTCIVSI ILGLDLGLLA GLIFGLLTVV LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540

EPQGVKILRF SSPIFYGNVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600

KNGIISDAVS TNNAFEPDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPIHSLVL 660

DCGAISFLDV VGVRSLRVIV KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIRKDTFFL 720

TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780

SEQ ID NO:233 PFH2 DNA SEQUENCE:

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTCGG 240

TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300

TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGOC 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 

TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540

ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600

CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720 

ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840

GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900

TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 

CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT GAATCTTGCA AA 

SEQ ID NO:234 PFH2 Protein sequence

Protein Accession #: NP_057113

1 11 21 31 41 51

I I I I I I

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTGASS 60 

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPBMIER 180 

KQGKIVTVNS IW3IISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240
IVENSLAGEV TKTIGNNGDQ SHKHTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300

KPTWAWWITN KMGKKRIENF KSGVDADSSY PKIFKTKHD 

SEQ ID NO:235 ACCS DNA SEQUENCE

Nucleic Acid Accession #: NM_000450
Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGQA 60 

GCCTGGTCTT ACAACACCTC CACGGAA6CT ATGACTTATG ATGAGGCCAO TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GGTTGCAATT CAAAACAAAG AAGAGATTGA GTACCTAAAC 180 

TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240 

TGGGTCTGGG TAGGAACCCA GAAACCTCTG ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300 

GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCCCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA GTCACCCACT GGGAAACTTC 600 

AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660 

ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTCCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840 

GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGGGACA ACGAGAAGCC AACGTGTAAA 900 

GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080 

GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140 

CTTCCTAGTG CTTCTGGCAG TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260 

GAGAAGCCCA CATGTGAAGC TGTGAQATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1500 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGOCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTACCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 



SEQ ID NO:236 AQPS Protein sequence:
Protein Accession #: NP_000441



1 11 21 31 41 51 

I I I I I I 

MXASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60 

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED* CVEIYIKREK 120 

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCBQXVNC 180 

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNW 240

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGPELMG AQSLQCTSSG NWDNEKPTCK 300 

AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360 

VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420 

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KINHSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600 
GSYQKPSYIL 

SEQ ID NO:237 PM28 DNA SEQUENCE

Nucleic Acid Accession #: N51002

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCX7 CAATGAGCCA AAGGGGGTCC 60 

CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240 

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300 

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720 

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCGCG AGTGGGAGAG 900 

GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080 
AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140
AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200
AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260
ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320
CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380
CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTQA CTGAATCCAA TGAACGCCTA 1440
CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500
TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560
GAAATTGAAA AGCTGAGATC TGAACTTGAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620
GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680
TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740
GGCCGCAT6G GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800
AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860
TCTGATATTG ATGATGATGA CAOAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920
AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980
AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040
ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100
GGTACCTCCA TTACTGCCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160
GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCOCTGCCA GGGAAATGGA TCGGATGGGA 2220
GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280
GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CTCCTCCTCC TACCCCTAGA 2340
GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400
TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460
CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520
AAAGAAAAAG CTGGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580
GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640
AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700
CCAACTGTGG TCGCATGCCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760
TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820
AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880
ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940
TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000
GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060
CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGGT TACCTCAGTA CAGAAGTTAC 3120
TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180
GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240
TTAAAGAGGT TGAATTATGA CAQAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAACAT 3300
GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360
GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420
CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480
ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540
AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600
TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660
GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720
TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATQ TCTCGAGTAA 3780
GCGGCCGCTT TAA



SEQ ID NO:238 PM28 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51

I I I I I I

MMCEVHPTIN EDTFMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60
QRLQDVTYDR DSLORQLNSA LPQDIESLTG GLAGSKGADP PEFAALTKEL NACREQLLEK 120
EEBISELKAE RNNTRLLLEH LECLVSRHER SLRMTWKRQ AQSPSGVSSE VEVLKAUCSL 180
FEKHKALDBK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 240
HLEGMEPGQK VHBKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAQMKER LAALSSRVGE 300
VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360
NDKLENELAN KEAILROMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRIAAb 420
TKAEERHGNI EERMRHLEGQ LEEKNQELOR ARQREKMNEE HNKRLSDTVD RLLTESNERL 480
QLHLKERMAA LEEKNVLIQE SETPRKNLEE SLHDKERLAE EIBKLRSELD QLKHRTGSLI 540
EPTIPRTHLD TSAELRYSVG SLVDSQSDYR TTKVIRRPRR GRMGVRRDEP KVKSLGDHEW 600
NRTQQIGVLS SHPFESDTEM SDIDDDDRET IF5SHDLLSP SGHSDAQTLA MMLQEQLDAI 660
NKEIRLIQEE KESTELRAEE IENRVASVSL EGLNLARVHP GTSITASVTA SSLASSSPPS 720
GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAWEE DGREDKATIK CETSPPPTPR 780
ALRMTHTLPS SYHNDARSSL SVSLEPESLG liGSANSSQDS LHKAPKKKGI KSSZGRLFGK 840
KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLEEARR KGLPFAQWDG 900
PTWAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960
MVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020
KEWIGNEWLF SLGCiPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080
LKRLNYDRKE LERRREASQH EIKDVLVWSN DRIIRWXQAI GLREYANNIL ESGVHGSLIA 1140
LDENFDYSSL TLLLQIPTQM TQARQILERE YNNLLALGTE RRLDESDDKN PRRGSTWRRQ 1200
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKJTTTDVA SSRLQRLDNS TVRTYSCLE



SEQ ID NO:239 PCM DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60
AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120
TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180
AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240
ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300
GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360
AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAQAGTAGGC TACAAGAAGA GCATTCACTT 420
CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480
GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540
GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600
CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660
TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720
ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780
TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840
CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900
ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960
ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020
GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA



SEQ ID NO:240 PCI4 Protein sequence:
Protein Accession #: NP_057654

1 11 21 31 41 51

I I I I I I

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60 
KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 
KBWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 
VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 
IDHNQMFQYF ITWPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG ZFHKYDLSSL 300 
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKFVNSVPF 360 
EDGHTDNHLP LLENNTH 

SEQ ID NO:241 PBA7 DNA SEQUENCE 

Nucleic Acid Accession #: AA218134

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons) 



AATTCGCCCT TGCTTAATTA AG CATGT TTA CCTTCCTGTC ATCTGTCACT GCTGCTGTCA 60 
GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCTT CAGATCAAAA 120 
CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180 
CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240 
TCATCTTGTC ATCCTGCCTG CTTGGACTCG G AAGCTTAGT CTTG ATCCTC AOTTTATCCT 300 
ACACGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCTCXCTC TCTTCCATTG 360 
CCACTTGTGT TTACATCGCA GAGATTGCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420 
TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480 
CCAATGTTTT CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGAGTTTTGC 540 
AAGCAATTGC AATGTATTTT CTTCCTCCAA GCCCTOGGTT TCTGGTG ATG AAAGG ACAAG 600 
AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTGAGGAAC 660 
TCACTGTGAT CAAATCCTCC CTGAAAGATG AATATCAGTA CAGTTTTTGG GATCTGTTTC 720 
GTTCAAAAGA CAACATGCGG ACCCGAATAA TGATAGGACT AACACTAGTA TTnTTGTAC 780 
AAATCACTGG CCAACCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 
TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT OCACTGGGGT TGGAGTCGTC AAGGTCATTA 900 
GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CTCTGCATTG 960 
GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080 
TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1140 
GGATTTCTTC CCATAGCAGA AGCTCACTCA TGCCCCTGAG AAATGATGTG GATAAGAGAG 1200 
GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260 
TCACAGACCC TGGGGACGTC CCAGCTTTTT TGAAATGGCT GTCCTTAGCC AGCTTGCTTG 1320 
TTTATGTTGC TGCTTTTTCA ATTGGTCTAG G ACCAATGCC CTGGCTGGTG CTCAGCGAG A 1380 
TCTTTCCTGG TGGGATCAGA GG ACGAGCCA TGGCTTTAAC TTCTAGCATG AACTGGGGCA 1440 
TCAATCTCCT CATCTCGCTG ACATTTTTGA CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500 
TGTGCTTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG OCTGCC ATGG GTGTGCTTTA 1560 
TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620 
AGGGATGCTC TTTGGAACAA ATATCAATGG AGCTAGC AAA AGTGAACTAT GTGAAAAACA 1680 
ACATTTGTTT TATGAGTCAT CACCAAGAAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740 
AACCCCAGGA GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 
TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860
GGAGGGTGTC TTTGG ACCAA TGCATAGTTG CGACTCCTGT OCTCTCnTT CAGTGTCATG 1920 
GAACTGGTTT TGAAGAG ACA CTCTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980 
CAGAAGGAAC CTCAAAAGGT AGATGAGGTA CAAGGTCCTA AGTGATCTCT TTTTCTGAGC 2040 
AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100 
AGAGCAGCCT TTGAATAGAC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160 
TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220 
GTACAGTTTC TGCCTACCAA GACACTACTT GCACTGG ATC TTACGCAAAA AAGAACCAG A 2280 

ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAGAGGG TCCTGGCTAG 2340
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400 
TGAACTATAA ACTATAATTT AATGCAAAAT ATCCTTTTAT GAATTTCATG TTAATATTGT 2460 
GAAATATTAA AATAATTCCR CAATAGTTGA GAAAAATGAG C ATTTTTTTC CATTTTTAAA 2520 
AAATGCATAG AAAAGACAAT TTTAAAATCC TGGGACCATA TTTATTTAGA AGTAGCTGTT 2580 
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTA AGCAATT 2640 
AGGTTGAAGT TATTAAGTCA AGCCTAG AAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700 
TGTATAGTAA TCCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760 
TACGGTACAC AGGCTATAAT TGATGATGAT GTTCAGATAA CTGAAGACAC AATAAATGAC 2820 
ATTCAGACAT CAGGAMAA WW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880 
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940 
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATOATAG AAAAWCCACA ATCAAAWCTT 3000 
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTG ATG GAAGACACAC AAAAAACTTA 3060 
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120
GAATCCTAAG AGCCAAAGCT CCTTTTTATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180
AAACTGTCCC AATGTCATAT AAGG AAACAT GATCTATTAC ATTCCTTTAT AAC AATGTGG 3240 
AGAGACTATA AACCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACTGACTAAA 3300 
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360 
GGAGATTGGA GATCTCAATT CTATXATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420 
GTTTTTTGTT TTTGGAAAGA GAAGGGAAGT GTGTTCTGCC CCATGTTTCC TTCCGTGTTT 3480 
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCCITCAT TATAAATGGG 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCTTG 3600 
AGCATTCTTT TATATTTTTC TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720 
AAAAGGGGG A AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780 
ATTTACAGTA CTTATGG AG A GCCAGCAGAA G ACATCAGAG CACTCACTIC TTCCCATCTT 3840 
TGTTAAGGTT AGCGAATTAC CCATGGACAC TGTTAGGTGA GGCTCATTCG GCAGCCCTGA 3900 
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960 
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020 
TACCTTGGCT ATATAAGCAT Ori ' llC OCCC TATTCTATGT TTCTTTTTTT GGTGAACATT 408 0 
GAAAAACAGG AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140 
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200 
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260 
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA 



SEQ ID NO:242 PBA7 Protein sequence:

Protein Accession #: AAF91431 

MFTFLSSVTA AVSGLLVGYE LGIISOALLQ IKTLLALSCH EQEMWSSLV IGA1XASLTG 60 
GVLIDRYGRR TAHLSSCLL GLGSLVULS LSYTVLTVGR IAIGVSISLS SIATCVYIAE 120 
IAPQHRRGLL VSLNELMTVI GILSAYISNY AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180 
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVIKSSL KDEYQYSFWD LFRSKDNMRT 240 
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGWK VISTtPATLL 300 
VDHVGSKTFL CIGSSVMAAS LVTMGIVNLN IHMNFIHICR SHNSINQSLD ESVIYGPGNL 360 
STNNNTLRDH FKGISSKSRS SLMFLRNDVD KRGETTSASL LNAGLSHTEY QIVTDPGDVP 420 
APLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTS5MN WGINLUSLT 480 
FLTVTDLIGL PWVCFIYTIM SLDLXGLFWV CFIYTIMSLA SLLFWMFIP ETKGCSLEQ1 540 
SMELAKVNYV KNNICFMSHH QEELVPKQPQ KRKPQEQLLB CNKLCGRGQS RQLSPET 



SEQ ID NO:243 PAB4 DNA sequence:
Nucleic Acid Accession #: AA172056

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons)



TTTAGCCACC AGAGG ANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60 
TGGATnTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGCCT TGAAATAGTG 120 
ATO TTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180 
GATITCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCAC AGAANN TTATTTTNCC 240 
AAGAATTCCA AG ATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300 
CATCTCAG AC ATCGACAGAT GATTACATCA CTTATAGTTC TAGTAAATTT ATTAATATAA 360 
AACTCAGAGA C ATTCCAATA TCCACATTGC TTACACCATT AGGCATAGAT TCAGTGTCAG 420 
CTATGACAAT TGAAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480 
TGCTTG ATCC AGATGCAGGA CTGCAAATGT TAATATTTGT T CTGOAA GAA CAATCAAATA 540 
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600 
AGCCTACTAA ATCAG AATG A AAATAG AAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660 
TTTCTTAAGC TTACCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720 
AAATTGACAA TCTAAACTAA CAAGTCTTTT GA ATTTAT GC ATGGTAGTAA ACATTCTCTC 780 
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840 
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCOTT 900 
CTTGGTTTTT TATTTGGAGA GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACTAGAA 960 
ATT ATTTCT A AATACCAAA 

SEQ ID NO:244 PBQ8 DNA SEQUENCE

Nucleic Acid Accession #: X51405

Coding sequence: 3-1721 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 
I I I I I I

AAATGGCQTG CCCGTCTCTC CGOCGGCCCC CTGCCTCGCA GTGGTTTCTC CTGCAGCTCC 60
CCTGGGCTCC GCGGCCAGTA GTGCAGCCCG TGGAGCCGCG GCTTTGCCCG TCTCCTCTGG 120
GTGGCCCCAG TGCGCGGGCT GACACTCATT CAGCCGGGGA AGGTGAGGCG AGTAGAGGCT 180
GGTGCGGAAC TTGCCGCCCC CAGCAGCGCC GGCGGGCTAA GCCCAGGGCC GGGCAGACAA 240
AAGAGGCCGC CCGCGTAGGA AGGCACCGCC GGCGGCGGCG GAGCGCAGCG ATGGCCGGGC 300
GAGGGGGCAG CGCGCTGCTG GCTCTGTGCG GGGCACTGGC TGCCTGCGGG TGGCTCCTGG 360
GCGCCGAAGC CCAGGAGCCC GGGGCGCCCG CGGCGGGCAT GAGGCGGCGC CGGCGGCTGC 420
AGCAAGAGGA CGGCATCTCC TTCGAGTACC ACCGCTACCC CGAGCTGCGC GAGGCGCTCG 480
TGTCCGTGTG GCTGCAGTGC ACCGCCATCA GCAGGATTTA CACGGTGGGG CGCAGCTTCG 540
AGGGCCGGGA GCTCCTGGTC ATCGAGCTGT CCGACAACCC TGGCGTCCAT GAGCCTGGTG 600
AGCCTGAATT TAAATACATT GGGAATATGC ATGGGAATGA GGCTGTTGGA CGAGAACTGC 660
TCATTTTCTT GGCCCAGTAC CTATGCAACG AATACCAGAA GGGGAACGAG ACAATTGTCA 720
ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTGAGA 780
AGGCAGCGTC TCAGCCTGGT GAACTCAAGQ ACTGGTTTGT GGGTCGAAGC AATGCCCAGG 840
GAATAGATCT GAACCGGAAC TTTCCAGACC TGGATAGGAT AGTGTACQTG AATGAGAAAG 900
AAGGTGGTCC AAATAATCAT CTGTTGAAAA ATATGAAGAA AATTGTGGAT CAAAACACAA 960
AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGGATTAT GGATATTOCT TTTGTGCTTT 1020
CTGCCAATCT CCATGGAGGA GACCTTGTGG CCAATTATOC ATATGATGAG ACGCGGAGTG 1080
GTAGTGCTCA CGAATACAGC TCCTCCCCAG ATGACGCCAT TTTCCAAAGC TTGGCCCGGG 1140
CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATGT CGCAAGAATG 1200
ATGATGACAG CAGCTTTGTA GATGGAACCA CCAACGGTGG TGCTTGGTAC AGCGTACCTG 1260
GAGGGATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320
GCTGTGAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGGGAGGAT AACAAAAACT 1380
CCCTCATTAG CTACCTTGAG CAGATACACC GAGGAGTTAA AGGATTTGTC CGAGACCTTC 1440
AAGGTAACCC AATTGCGAAT GCCACCATCT CCGTGGAAGG AATAGACCAC GATGTTACAT 1500
CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560
CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620
GGGTTGATTT TGAACTGGAG TCATTTTCTG AAAGGAAAGA AGAGGAGAAG GAAGAATTGA 1680
TGGAATGGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AAAAGGCTTC TAGTTAGCTG 1740
CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800
CAGTTAATAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860
AAATAAATAG CCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920
TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTTAAG TAATTTTGCC 1980
ATCCTAGGCT TAAATGCAAT ATTCCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040
TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATG 2100
AATGCTATTG AAAAGGTTAA CAGATACAGC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160
TAAATAGTTC AGTATAAATT GTCGTTTTTT TCTTGTGCTG ACTAACTATA AGCATGATCT 2220
TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280
AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340
TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTGTA GAGTGGCCCA GAATTGCATT 2400
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA



SEQ ID NO:245 PBQ8 Protein sequence
Protein Accession #: P16870



MAGRGGSALL ALCGALAACG W1XGAEAQEP GAPAAGMRRR RRLQQEDGIS FBYHRYPELR 60 
BALVSVWLQC TAISRIYTVG R5FEGRELLV ELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120 
RELUFLAQY LCNEYQKGNE TIVNUHSTR IHIMPSLNPD GFEKAASQPG ELKDWFVGRS 180 
NAQGIDLNRN FPDLDRrVYV NEKEGGPNNH LLKNMKKTVD QNTKLAPBTK AVIHWIMDIP 240 
FVLSANLHGG DLVANYPYDE TRSGSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTTNGGAWY SVPGGMQDFN YLSSNCFETT VELSCEKFPP EETLKTYWED 360 
NKNSUSYLE QIHRGVKGFV RDLQGNPIAN ATB VEGIDH DVTSAKDGDY WRLLIPGNYK 420 
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF 



SEQ ID NO:246 PBY4 DNA sequence
Nucleic Acid Accession #: AF038966

Coding sequence: 81-1107 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I

GGGGCGACGT GAGCGCGCAG GGGGGCGGCG GCCTCGCCTC GTCTCTCTCT CTGCGCCTGG 60
GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTTGCC 120
GACCCGGATC TCAACAATCC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180
CCACCAGGAC TTGATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240
GTGAAGATGC CTAATGTACC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300
CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360
CAAGAAGAAC TAGAAAGAAA AGCCGCAGAA TTAGATCGTC GGGAACGAGA AATGCAAAAC 420
CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480
CCTTGTTTCT ATCAGGAATT TTCTQTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540
CTTATGTACT ACTTGTGGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600
TTGGCTTGGT TTTGTGTTGA TTCTGCAAGA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660
TTCTTGCTTT TTACTCCTTG TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720
AGGAGTGACA GTTCATTTAG ATTCTTTGTA TTCTTCTTCG TCTATATTTG TCAGTTTGCT 780
GTACATGTAC TCCAAGCTCC AGGATTTCAT AACTGGGGCA ATTGTGGTTG GATTTCATCC 840
CTTACTGGTC TCAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900
TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960
ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020
AAAACTGTCC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080
AATGCTTTCA AGGGTAACCA GATTTAAGAA TCTTCAAACA ATACACTGTT ACCTTTT6AC 1140

TGTACCTTTT TCTCCAGTTA CT6TATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200 

CA6ACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 

GTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAO AAAAAATATT 1320 

ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380 

GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 

CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTGGAAAAG TAAACCATGT 1500 

TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATA6GCC AGCAGAGACT TAGGGATTTT 1560 

AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620 

TAGATAATGT AAAATTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTG 1680 

ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 

GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800 

CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT TTATATAATA 1860 

TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920 
CTTTTT 



SEQ ID NO:247 PBY4 Protein sequence:

Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTFPPGG VKMPNVFNTQ 60 
PAJMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIFGC LAWFCVDSAR 180 
AVDFGLS1LW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIPV GIMMIIIAAL FTASAVISLV MFKKVHGLYR TTGASFEKAQ 300 
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



SEQ ID NO:248 PBH2 DNA sequence

Nucleic Acid Accession #: none found

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon)



ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC 60 
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120 
AGTGCACTAC AATTTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAG A AGTAGTAAAA CTCCTGCTGO ACAGACGATG TCAACTTAAT 240 
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTGCTGGA ACATGGCACT G ATCCG AATA TTCCAGATGA GTATGG AAAT 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420 
TACGGTGCTG ATATCG AATC A AAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480 
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATGCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 PBH2 Protein sequence:
Protein Accession #: none found

MRDNKSCAFF MGKLNVCFEG TVIAGYS VFA TTCUHLAVA SALQFFKKSS HPHRTALHLA 60 
SANGNSEVVK LLLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120 
TALHYAIYNE DKLMAKALLL YGADIESKNK HGLTPLLLG V HEQKQQWKF UKKKANLNA 180 
LDRYGRCVTL GTLFTTKYW IYEK 



SEQ ID NO:250 PBJ1 DNA sequence
Nucleic Acid Accession #: XM_005829

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon)

ATG GTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120 
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180 
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300 
ATCAGGAGCA GATTTGAAGA ATTACAA AGT G AATTGGTGC CAGTCAGCAT GTCAGAGACA 360 
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420 
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAA AATGA GTCCAAACTA 480 
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540 
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600 
ACAGGCTCAG AAAATTCTGA ACA AATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660 
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720 
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900 
AATA AG GO AG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960 
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020 
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080 
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200 

CAGTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATG TGAAGAGGCA 1260 
CGCCAAGAAA AAGAAGCAAT GGTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAGAT 1320 
CTTCGAAAGG AAAAAGAGAC ACTTGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380 
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTGCA CCAGCTGTAT 1440 
GAAACTAAGG AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500 
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560 
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620 
GAAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680 
GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTGAA 1740 
AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800 
GAAGATCTGA AGAGAACATT TAAGGAGGGT ATGGATGAGT TAAGAACACT GAGAACAAAG 1860 
GTGAAATGTC TAGAAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920 
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980 
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040 
GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160 
GAATCCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220 
AGCCAGTGTG AACAAATGAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280 
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340 
GAAGTTAAAG CATTGAGTAC CCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400 
CGTAAACATG CCTCTAGTAT CAAGGATCTC ACCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460 
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580 
GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640 
ATAGTTAGGC TGCAAAAAGC ACATGCCCGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700 
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
TTACGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAGACGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATCCAGCTGA CAATGGATTA 2880 
ACATTGGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940 
CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000 
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060 
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTCCACTT TTTGTTTCAG CCAGTAAAAA TATTGTTTTG CTTCATCTGT ACACAAAAAA 3180 
ATACCCTTTT ACAATATGAA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240 
AATTTGTTTT TGTATGOTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300 
CTTGTATTCA f AAG AAGTGT TGAACATTAC AAGGGCllll AT 

SEQ ID NO:251 PBJ1 Protein sequence: 

Protein Accession #: NP_060487 

MVIIYLSFCN YYMEFYREEL PHIDYLIDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60 
LLLNNGSSAT LKTRTRCYGT PRGLPHRSLL QPTPPTCKTK IRSRFEELQS ELVPVSMSET 120 
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNEHNNRIE 180 
AQENYIPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAKH ISKTNETEQK VTQILVELRS 240 
STFPESANEK TYSESPYDTD CTKKFISKIK SVSASEDLLE EIESELLSTE FAEHRVPNGM 300 
NKGEHALVLF EKCVQDKYLQ QEHIIKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360 
KQHMNTIKQL ESRIEELNKE VKASRDQLIA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420 
RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKIKQLS QEKGRLHQLY 480 
ETKEGETTRL IREIDKLKED INSHVIKVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540 
EEADQIRKNC QDMIKTYQES EEIKSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKIKEL 600 
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE IINRQKAEIQ NLLDKVKTAD 660 
QLQEQLQRGK QEIENLKEEV ESLNSLINDL QKDIEGSRKR ESELLLFTER LTSKNAQLQS 720 
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLESRLLKE EELRKEEVQT LQAELACRQT 780 
EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK LDQVESGSYD KEVSSMGSRS 840 
SSSGSLNARS SAEDRSPENT GSSVAVDNFP QVDKAMLIER IVRLQKAHAR KNEKIEFMED 900 
HIKQLVEEIR KKTKIIQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960 
TLELSLEINR KLQAVLEDTL LKNITLKENL QTLGTEIERL IKHQHELEQR TKKT 






SEQ ID NO:252 PBJB DNA sequence 
Nucleic Acid Accession #: 083760 

Coding sequence: 55-1459 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

TTGCCGTGAA GGGCTGTGCG GTTCCCGTGC 6CGCC6GAGC CTGCTGTGGC CTCTTATGCA 60 

CTCCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTGCT 120 

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCGCTG GACGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGGATCTGCA 360 

GTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCTCCTGT 480 

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540 

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600 

GCAGCCTCCG TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCOC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720 

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTGG 780 

CCAACCTGTA GATGCCACAG CTGATAGACA TGTAGTGCTA TCQATACCAA ATGGAGACTT 840 

TCGACCAGTT TGTTACGAGG AGCCCCAGCA CTGGTGCTCG GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCAGGCTTC CTCCCGAAGT GTGCTCATAG ATGGGTTCAC 960 

CGACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTCGG 1080 

GGGAGAGGTG TATGCCGAGT GCGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200 

CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAGTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440 
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T 



SEQ ID NO:253 PBJB Protein sequence: 
Protein Accession #: NP_005696 

MHSTTPISSL FSFTSPAVKR LLGWKQGDEE EKWAEKAVDS LVKKLKKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120 
QKEVCINPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATYPDS 180 
FQQPPCSALP PSPSHAFSQS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASETQ 240 
SGQPVDATAD RHVVLSIPNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVLIDG 300 
FTDPSNNRNR FCLGLLSNVN RNSTIENTRR HIGKGVHLYY VGGEVYAECV SDSSIFVQSR 360 
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQSVHH GFEVVYELTK MCTIRMSFVK 420 
GWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO:254 PBJB DNA sequence 
Nucleic Acid Accession #: AB04684 

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60 

GATGGACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATGTT CTGTCTTGTG CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480 

GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720 

CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900 

ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTGAGCTCCG AGAAGAATGA CACCAGCCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440 

ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAGAACAGC 1500 

AGCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAG CAATCCCCAA AGTCCGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680 

ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTCGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800 

ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTGTCGTGG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCCCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCCGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400 

GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG CCTCCTTTCC 2460 

CATGCCCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CACACTGTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700 

GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 
ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880 

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGAGTGTG GGGCCATCTG CAGGTCGGTG 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTGCATT GCAATGTTGT GTACTCTGAT GTGGCTGCTC TGAAGTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGACGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACCCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC CCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TGTTTCTTTA AAACAGAGTT CTTAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTGTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO:255 PBJB Protein sequence: 
Protein Accession #: BAB13455 

MKTPDFDDLL AAFDIPDMVD PKAAIESGHD DHESHMKQNA HGEDDSHAPS SSDVGVSVIV 60 
KNVRNIDSSE GGEKDGHNPT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQFSPISSA EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 
KTGLSTSGNV EKNKAVKRET EASSINLSVY EPFKVRKAED KLKESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNLIDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360 
IKTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAPLQ 420 
SAVVTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480 
VQSASSAIIK AANAIQQQTV VVPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540 
KPQQQIKQAI INAAASQPPK KVSRVQVVSS LQSSVVEAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVRIEVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGVVM QCSHLILKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGITG 720 
TVISAPSSTP ITPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHFQQ AADTSGQKTC 780 
TICQMLLPNQ CSYASHQRIH QHKSPYTCPE CGAICRSVHF QTHVTKNCLH YTRRVGFRCV 840 
HCNVVYSDVA ALKSHIQGSH CEVFYKCPIC PMAFKSAPST HSHAYTQHPG IKIGEPKIIY 900 
KCSMCDTVFT LQTLLYRHFD QHIENQKVSV FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPPNLGINL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGIRKVY 1080 
ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140 
EEPVLEFRPP RGAITQPLKK LKINVFKVHK CAVCGFTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
RECGLCYTSH VSLSRHLFIV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK 



SEQ ID NO:256 PBM1 DNA sequence 
Nucleic Acid Accession #: AF111847 

Coding sequence: 58-1608 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

TTTTCGTCGA CTCTTACCGG TT6GCTGGGC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60 

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120 

AACAAGGTOT GTTTTGATTG TGGTGCCAAA AATCCCAGCT GGGCAAGCAT AACCTATGGA 180 

GTGTTCCTTT GCATTGATTG CTCAGGGTCC CACCGGTCAC TTGGTGTTCA CTTGAGTTTT 240 

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTCCTTTTTT CATCAACATG GGTGTTCCAC CAATGACACC 360 

AATGCCAAGT ACAACAGTCG TGCTGCTCAG CTCTATAGGG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAGCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT GGTTCCACCT 480 

TTGTCCCCTC CACCAAAGGA GGAAGATTTT TTTGCCTCTC ACGTTTCTCC TGAGGTGAGT 540 

GACACAGCGT GGGCATCAGC AATAGCAGAA CCATCTTCTT TAACATCAAG GCCTGTGGAA 600 

ACCACTTTGG AAAATAATGA AGGTGGACAA GAGCAAGGAC CAAGTGTGGA AGGTCTTAAT 660 

GTACCAACAA AGGCTACTTT AGAGGTATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAAGT TTGGGAGCTC AGAAACTGGC AAACACATGC 780 

TTTAATGAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTC 840 

GCCAAGGTGG TATCTAAAGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAATGAAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCATGGG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140 

GAGTTAAGGA GCAGTTCTTT CTCTAGCTGG GATGACAGTT CAGATTCCTA TTGGAAAAAA 1200 

GAGACCAGCA AAGATACTGA AACAGTTCTG AAAACCACAG GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGCCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320 

GGCAATGTCA AGGCCATTTC ATCAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGACCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCATAAGCTC GGCTGATCTG 1440 

TTCGAGGAGC CGAGGAAGCA GCCAGCAGGG AACTACAGCC TGTCCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TTGCTGGAAA ACTCTCCGTC 1560 

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTCTTAATA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAACAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740 

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGCGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040 

TTCCTTCAAA AGACCAAAAG TGACTGGTGT CTCGTGTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAAATAATAT TTGAAGTCAT CTGTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATGTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

CTGCCCTGCC AAGGGAATTA ATGTTATCTT GTGAAAQGTG TTGCTGTTTG AATTGATGAG 2280 

AAATGGAAGA TGAGAACTCG CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340 

ACGGTATACA GAGTTAAAGT GGAATGAGGT AAGAAGATAC AGCTACAGAA AATAGTTGCG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAATAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA GTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAATGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640 

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO:257 PBM1 Protein sequence: 
PBM1 Protein sequence: CAB76901 

MGDPSKQDIL TIFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLCIDCSG SHRSLGVHLS 60 
FIRSTELDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKIKSL 120 
ASQATRKHGT DLWLDSCVVP PLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180 
ETTLENNEGG QEQGPSVEGL NVPTKATLEV SSIIKKKPNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEIEKQAQ AADKMKEQED LAKVVSKEES IVSSLRLAYK DLEIQMKKDE KMNISGKKNV 300 
DSDRLGMGFG NCRSVISHSV TSDMQTIEQE SPIMAKPRKK YNDDSDDSYF TSSSSYFDEP 360 
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420 
FGNVKAISSD MYFGRQSQAD YETRARLERL SASSSISSAD LFEEPRKQPA GNYSLSSVLP 480 
NAPDMAQFKQ GVRSVAGKLS VFANGVVTSI QDRYGS 



SEQ ID NO:258 PBM4 DNA sequence 
Nucleic Acid Accession #: 030891 

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

ATGGATACTG TCATGAAGCA GACACATGCT GACACACCTG TTGATCATTG TCTATCTGGC 60 
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTGAAATGC AGAATCCAAA TTTGAACAAT AAAGAATGTT GTTTCACCTT TACGTTGAAT 180 
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGAGT 240 
ATCTACTCAG CCCTGAGTGC TAATGACTAT TTCAGTGAAA GGATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AGAAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTGA TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420 
AAAGAAGATG GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATG CATTCTTTTT 480 
CATGTTGTTG CTATAGGAAG GACAAGAAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTTTGTAT TTATGCCTTG AAGGGTGAGA CTATTGAAGG AGCCTTATGC 600 
AACGATGGCC GTTTTCGGTC TGACATAGGT GAATTTGAAT CGAAACTAAA GGAAGGTCAT 660 
AAGAAAATTT ATGGAAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720 
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTOATAC AGTCTAAGAA AAAAGTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900 
CCTCAGGATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 
AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGGCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATG TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAGA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1140 
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTGCAA CCTGCGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320 
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACCTGTCGAC ATGTTGTACA TCTTATGGTG GGTAAAAACA CACATCCAAG TTTGTGGCCA 1440 
GATATAATTA GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTCCT 1500 
GACAATTGGT TTTCCATTGA GCCATGGCTT AAAGTGTCCA ATGAAAATCT AGATTATGCC 1560 
ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGCG ACAGATTTCT 1620 
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTGTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACGATTGT 1740 
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCTTA GTTATGATAC TTGTTTCTCT 1860 
GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920 
TTTGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TCCGCTTTCG CTCTCGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160 
ATAGAAGCAG GCAAGGACCG CCGTGGGCAC GCGGTCAGTG AGACAGGGTC CTGCTCGCGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTCCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400 
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC CCCAAAATAG GACAATATAT 2520 
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGCCAAGA AATGCTTGTG 2580 
CGTGGCACAG AAGGAATCAA AGAGTACATA AACCTTGGAA TCCCCCTCAG TTGTTTCCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GGAAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTGTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATTGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TGAGAGAACA AATCGTGGCT CAGTACCCCA GTTTGAAAAG AGAAAGTGAA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCGATTAA AGTAGTGAAA 3240 
CTTCTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360 
AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420 
AGGGTGACAT TTGGTTATGA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTG AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGAAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TTGGCCATCC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TGCTGTGATC 3660 
CCTCAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCTAAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAACCCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
CGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAACCAT GGTATGAAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGGACTTGX QAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080 
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAGGCATTTT TCTAAGCACA TGAAGAAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA ATM I 111 11 TTTTTGAGAC TGAGTCTCAC 4260 
TCTGTCGCCT GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTGAGCAGCT GGGATTACAG GCAAACGCCA 4380 
CCACACCCAG CT AAA 1 11 IT HUH 11 11 TGTATTnTA GTAGAGACAG GGTTTCACCA 4440 
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGGAA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040 
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100 
ATCTGTTTTC CATTTCCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160 
CTAGGTCCAG GGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220 
GGGOTCGAGT GTAGGAAAAC AGCCTQTTGC ATTQTAAGAG TGATGTCACC TTGAAGAGCA 5280 
GCTGCCATGA TGACTGCTGT TTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAACC TCCCCAGTOT 5400 
TCAGGTGTTT CACAAGAAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460 
GTTATATAAA GGGTACTTTT GGGAGGGTGA GTGCCGCCAT TTAGTGGCTG CTAGAAACAT 5520 
TGCTTCTGTT TGTAAGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 



SEQ ID NO:259 PBM4 Protein sequence: 
PBM4 Protein sequence: BAB67788 



MDTVMKQTHA DTPVDHCLSG IRKCSSTFKL KSEVNKHETA LEMQNPNLNN KECCFTFTLN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERIKNQFN KNIIVYEEKT IDGHINLGMP 120 
LKCLPSDSHF KITFGQRKSS KEDGHILRQC ENPNMECILF HVVAIGRTRK KIVKINELHE 180 
KGSKLCIYAL KGETIEGALC KDGRFRSDIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240 
DISKKKALQQ KDIHKKIKQN ESATDEINHQ SLIQSKKKVH KPKKDGETKD VEHSREQILP 300 
PQDLSHYIKD KTRQTIPRIR NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
LLKNYQTLNE AIMHQYPNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DFGKMTANSV 420 
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHVVHLMV GKNTHPSLWP 480 
DIISKCAKVT FTYTEFCPTP DNWFSIEPWL KVSNENLDYA ILKLKENGNA FPPGLWRQIS 540 
PQPSTGLIYL IGHPEGQIKK IDGCTVIPLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHALIEFGYS 660 
MDSILCDIKK TNESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PILGTGETGR 720 
IEAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780 
GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QTMPQNRTIY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGIKEYI NLGMPLSCFP EGGQVVITFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETIKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KIIENFKKKM KVKNGETLFE LHRTTFGKVT KNSSSIKVVK 1080 
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATIIGQCV 1140 
RVTFGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200 
IHIIGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNPD 1260 
VTIYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSIIEFGSTM ESILLDIKQR 1320 
HKPWYEEVFV NQQDVEMMSD EDL 

SEQ ID NO:260 PBQ1 DNA sequence 

Nucleic Acid Accession #: NM_015642 
Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon) 






1 11 21 31 41 51 

I I I I I I 

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 

TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120 

CTGATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 

TGCAGCCGCT CTCTGCTCCC TGCCCCAATG AACATCTGCA CTA6GCCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360 

CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACOGGGTC ATCTGATTGT GACATCAGTT 480 

QCAAGGGGAT GACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540 

TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTG TGACGTAACG GTGCGCATCC 600 

ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CGGCAGCCCC TTCTTCCAGG 660 

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960 

ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTCCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200 

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260 

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560 

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860 

CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGGCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTOACA CACACAGGAG 2160 
TQAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220 

TGCACATGCG CCTCCACCGO GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 

TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGAGCC 2340 

CCCCTGCAGG CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 

AGGGGACCAC TTACGTCTGG TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 

ACGACCACAT GAGGATGCAT GTGTCTGACG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 

AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580 

GAAATGTTTT GGTTTCATTT TTACTTTCTG TTTTTGTTTT TGTTTCGTTT CATTTTGTAC 2640 

TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 
AAAAAAAAA 



SEQ ID NO:261 PBQ1 Protein sequence: 

Protein Accession #: NP_056457 

MTERIHSINL HNFSNSVLET LNEQRNRGHF CDVTVRfflGS MLRAHRCVLA AGSPFFQDKL 60 
LLGYSDIEIP SWSVQS VQK UDFMYSGVL RVSQSEALQI LTAASILQIK TVTOECTRIV 120 
SQNVGDVFPG IQDSGQDTPR GTPESGTSGQ SSDTESGYLQ SHPQHSVDRI YSALYACSMQ 180 
NGSGERSFYS GAWSHHETA LGLPRDHHME DPSWITRIHE RSQQMERYLS TTPETTHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ RVQILERNES EECTEDTDQA EGTESEPKGE 300 
SFDSGVSSSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQLETGASSP 360 
ERSNEVEMDS TVITVSNSSD KSVLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPGLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHTGE KPHQCSICWR 540 
SFSLKDYLIK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CYICKKKFSH 600 
KTLLERHVAL HSASNGTPPA GTPPGARAGP PGVVACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO:262 PBQ8 DNA sequence 
Nucleic Acid Accession #: AI654187 

Coding sequence: 1-812 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

ATGGTGGAAG AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGCC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCACGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780 

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGATAA 



SEQ ID NO:263 PBQ8 Protein sequence: 
Protein Accession #: NP_060170 

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKUK EUQTEKDYL NDLELCVREV 120 
VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:264 PBY7 DNA sequence 
Nucleic Acid Accession #: NM_014323 

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACOCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGOCGCCGC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

OGCGGCGGAC CCCTCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGOGCCGGC 300 

GCCGCCTGGC GGGCGGGAGO GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACTACAGA 540 

GCCTCGGGCC GGCACGTGTG GGGAGTGTGG ACACGTCTGC TGCGCCCCGC TTCTCGCTGC 600 
TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660 
CATGGAGCGO GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720 
CAGACACAGC ACGGAGATGC TGCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780 
CTGCGACGTO CTCTTGCGGG TAGGCGACGA GAGCTTCCCA GCGCACCGCG CCGTGCTGGC 840 
CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900 
GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CCAGGGGGCG GGGCCGGGGG 960 
CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020 
CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080 
CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTCCAA 1140 
CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTCGCC CCCCTGGGAC 1200 
CTCGGACTTG GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260 
TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320 
TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCCCA TGGTGGCTGG 1380 
ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTCCCCAGT GTGGCATCCA GTGCCCCTCC 1440 
CCTGACTGGC AAGCGAGGCC GGGGCCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500 
TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560 
GTTCACTGAT GCCAACCGGC TCCGGCAGCA CGAGGCCCAG CACGGTGTCA CCAGCCTCCA 1620 
GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680 
AGACCCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740 
CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTCCC ACTCTGGGGA 1800 
GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860 
CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920 
AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980 
GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040 
CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100 
ATACATGGCA GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160 
TAACCGAGGT TTCTCCTCTG CCTCCTACTT AAAGGTCCAT GTTAAAACCC ACCACGGTGT 2220 
TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280 
CTGCGCCAGG ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTGA 2340 
GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400 
GAGTGCCAAT GGCTCTTTCT CCTGCGACAT GGCAGTCCCC AAAAACAAAA TGGAGTCTGA 2460 
TGGGGAGAAG AAGTACCCAT GCCCTGAATG TGGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520 
GAACAAACAC ATCCAGAAGG TGCATGTCCG GGCTCTCGGG GGCCCCCTGG GGGACCTGGG 2580 
CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640 
GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700 
GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760 
GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820 
AGATTTTTAT TCATTTTTAA CTGCCCCCCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880 
TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940 
ACTTGGTATG GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCCCA 3000 
GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA CCCCAATCCT 3060 
ATACCCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCCC ACTTCTCTAG 3120 
TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180 
TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG ACCATGGGGT 3240 
GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300 
GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360 
CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420 
TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480 
TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACCG CAGTCACACA TTAGGGTTAG 3540 
TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600 
GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660 
TTTCAATGCT GTTGGGAACC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720 
TCAGTTGTGT CACATGTGAG CAAGCCCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780 
GACTGTATTA AAAATGTTAQ TACATTACTC TA 



SEQ ID NO:265 PBY7 Protein sequence: 
Protein Accession #: NP_114439 

MERVNDASCG PSGCYTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60 
ACSEYFESVF SAQLGDGGAA DGGPADVGGA TAAPGGGAGG SRELEMHTIS SKVFGDILDF 120 
AYTSRIVVRL ESFPELMTAA KFLLMRSVIE ICQEVIKQSN VQILVPPARA DIMLFRPPGT 180 
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240 
PLSPQLLTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCGLCGKV 300 
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACEIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPYICQSCGK 420 
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600 
FSPQQNMSLL ESFGFQIVQS AFASSLVDPE VDQQPMGPEG K 






SEQ ID NO:266 PBY9 DNA sequence 
Nucleic Acid Accession #: NM_012429 

Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60 
GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120 
TGCCGCACCC OCCGCCTCCC GOCCCCAAAC OOCATCCCCG CGGTTGAGCC ACGATGAGCG 180 

GCAGAGTCGG CGATCTGAGC CCCAGGCAGA AGGAGGCATT GGCCAAGTTT CGGGAGAATG 240 

TCCAGGATGT GCTGCCGGCC CTGCCGAATC CAGATGACTA TTTTCTCCTG CGTTGGCTCC 300 

GAGCCAGAAG CTTCGACCTG CAGAAGTCGG AGGCCATGCT CCGGAAGCAT GTGGAGTTCC 360 

GAAAGCAAAA GGACATTGAC AACATCATTA GCTGGCAGCC TCCAGAGGTG ATCCAACAGT 420 

ATCTGTCAGG GGGTATGTGT GGCTATGACC TGGATGGCTG CCCAGTCTGG TACGACATAA 480 

TTGGACCTCT GGATGCCAAG GGTCTGCTGT TCTCAGCCTC CAAACAGGAC CTGCTGAGGA 540 

CCAAGATGCG GGAGTGTGAG CTGCTTCTGC AAGAGTGTGC CCACCAGACC ACAAAGTTGG 600 

GGAGGAAGGT GGAGACCATC ACCATAATTT ATGACTGCGA GGGGCTTGGC CTCAAGCATC 660 

TCTGGAAGCC TGCTGTGGAG GCCTATGGAG AGTTTCTCTG CATGTTTGAG GAAAATTATC 720 

CCGAAACACT GAAGCGTCTT TTTGTTGTTA AAGCCCCCAA ACTGTTTCCT GTGGCCTATA 780 

ACCTCATCAA ACCCTTCCTG AGTGAGGACA CTCGTAAGAA GATCATGGTC CTGGGAGCAA 840 

ATTGGAAGGA GGTTTTACTG AAACATATCA GCCCTGACCA GGTGCCTGTG GAGTATGGGG 900 

GCACCATGAC TGACCCTGAT GGAAACCCCA AGTGCAAATC CAAGATCAAC TACGGGGGTG 960 

ACATCCCCAG GAAGTATTAT GTGCGAGACC AGGTGAAACA GCAGTATGAA CACAGCGTGC 1020 

AGATTTCCCG TGGCTCCTCC CACCAAGTGG AGTATGAGAT CCTCTTCCCT GGCTGTGTCC 1080 

TCAGGTGGCA GTTTATGTCA GATGGAGCGG ATGTTGGTTT TGGGATTTTC CTGAAGACCA 1140 

AGATGGGAGA GAGGCAGCGG GCAGGGGAGA TGACAGAGGT GCTGCCCAAC CAGAGGTACA 1200 

ACTCCCACCT GGTCCCTGAA GATGGGACCC TCACCTGCAG TGATCCTGGC ATCTATGTCC 1260 

TGCGGTTTGA CAACACCTAC AGCTTCATTC ATGCCAAGAA GGTCAATTTC ACTGTGGAGG 1320 

TCCTGCTTCC AGACAAAGCC TCAGAAGAGA AGATGAAACA GCTGGGGGCA GGCACCCCGA 1380 

AATAACACCT TCTCCTATAG CAGGCCTGGC CCCCTCAGTG TCTCCCTGTC AATTTCTACC 1440 

CCTTGTAGCA GTCATTTTCG CACAACCCTG AAGCCCAAAG AAACTGGGCT GGAGGACAGA 1500 

CCTCAGGAGC TTTCATTTCA GTTAGGCAGA GGAAGAGCGA CTGCAGTGGG TCTCCGTGTC 1560 

TATCAAATAC CTAAGGAGTC CCCAGGAGCT GGCTGGCCAT CGTGATAGGA TCTGTCTGTC 1620 

CTGTAAACTG TGCCAACTTC ACCTGTCCAG GGACAGCGAA GCTGGGGGTG GCGGGGGGCA 1680 

TGTACCACAG GGTGGCAGCA GGGAAAAAAA TTAGAAAAGG GTGAAAGATT GGGACTTAAC 1740 

ACTTCAGGGA AGTCAGCTGC CGGGGAGAAA CTTGCTCCTA AATGAACACA TAAGTTTAGA 1800 

TCGCAATGAG GAGTAGCAGG GTAGCTGGTT GCTAGAGTTA CGGTGGGGAT CAGAAACTCT 1860 

TCCAAACATT TTAGCACTGA GGCTGGGGTA GCTTTTGGCT TTTCCCAGGT CTCAGGAGGT 1920 

GGCCTGAGTC AGCACACATC TTCCCACTCG GTAGACAGGC TGGCCTCTCC CTCACTTTGA 1980 

GACTTTGGCA ACTCCTGGGC CACACGGCCT GCCTCTTTGA TTACTAATGA TTGTCAGTGA 2040 

CTCAGAGCTT CCTGGGACTT CGGGTACCCA CCCGCTGTTC TCCATGCAAA CAAAGCGCCA 2100 

GGGAAATGAC CCACAGGGAT CGCAGCTGCA GGGAGGGCCA GGGAGGTTGG GGGTGGGAGT 2160 

GAATGCTAAA AGCAGATCGT CCAGTGCCCT TTTCAGTGCT ACCGGCCTCT CACCAAGCAG 2220 

TCCTCCATGT GAGCAACCCC GAGACAAAAA TGCTAAGTGG GATCAAGAGA GCAGCACTCG 2280 

GAGAGGGTGT TTGCCAGTCT GAGTGTCCCG CGGTGCCCGC CAACCCGCTT CCTGACTGAC 2340 

CTGAGCAAGG TCTTACTAAG CAGTCCCATC TCTGTGGGAG GCATGCAACG CGTGCAGGGA 2400 

GTTCAGGTGC CGGTCGGCGT AGCCAGGCCT GGAGGCCCCC CAGGCAGGAG GCCGCCCAAA 2460 

GGCGGGGCCG GCGTCTCGCA GACTAGGGGC TGGGGGCGGC CACAGACGGC CTCGAAACCA 2520 

CAGCCCTTAC CCCAATCCCA CGAGCCCCGC CAACGAACCA CAGGTGCTGG GCTTTAGAGA 2580 

ACATGGGAAG GOGGCCCCAG ACCTGGCGGG AACGCCTTTC CCTCAGAGCC AGGCCCCGGC 2640 

CCCGTCTGGG AAGCTCATCT TGCGAAGCTG AGGGAGCTCA GGGCAAAGGC CAGGCTAGCG 2700 

CGGACCGGAA GGGGCCGAGG CTGCACGGGC CTCTGCCAGA ACGCTCAGGA CATCCCGGCC 2760 
TGGGTTTACA ACGCTGTTAG GAAAATTAAC CAATGAATAA AGCAACGTTC AGTGCGCA 



SEQ ID NO:267 PBY9 Protein sequence: 
Protein Accession #: NP_036561 

MSGRVGDLSP RQKEALAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60 
EFRKQKDIDN IISWQPPEVI QQYLSGGMCG YDLDGCPVWY DIIGPLDAKG LLFSASKQDL 120 
LRTKMRECEL LLQECAHQTT KLGRKVETIT IIYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180 
NYPETLKRLF VVKAPKLFPV AYNLIKPFLS EDTRKKIMVL GANWKEVLLK HISPDQVPVE 240 
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEILFPG 300 
CVLRWQFMSD GADVGFGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360 
YVLRFDNTYS FIHAKKVNFT VEVLLPDKAS EEKMKQLGAG TPK 



SEQ ID NO:268 PBH8 DNA sequence 
Nucleic Acid Accession #: XM_009756 

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon) 
1 11 21 31 41 51 

| | | | | | 

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 

CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120 

TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180 

TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240 

CCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300 

ATGAAATGTG TCTTGGCGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 

CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 

TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480 

GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540 
TTCCTGGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600 

ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTCCACC TCCGCTACGC ACACCACCTC 660 

CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GCGGGGCGGC 720 

TGGGTGTGGG TGCAGAGCTA CGCCACCGTG GTGCACAACA GCCGCTCGTC CCGGCCCCAC 780 

TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 

CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900 

CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960 

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCAGTGACCT TCTGTACACG CCATCCTACA GCCTGCCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200 

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260 

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320 

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA CGCAGACTGA 1440 
CTCCTGTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBH8 Protein sequence: 
Protein Accession #: NP_005060 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240 
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300 
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VHTNGR 

SEQ ID NO:270 PBJ9 DNA sequence: 

Nucleic Acid Accession #: AA760894 



GGCACGAGGA GAAGATGTGG CTTGCTCATG CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60 
CCAGCCATGT GGAACTGTTT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120 
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180 
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATCCATC 240 
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300 
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATGTCTTCA 360 
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420 
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480 
GAAATGAGGG ATTCTCTCCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540 
TTTGGACTTG CCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600 
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660 
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720 
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780 
TACACATGAA AACCCCCAAG GGGAATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840 
YGTGATCATY TAGAGATGTA CAOAAAAGGT GAATCTGTGT TCTGTATATT CTGCCTAAGG 900 
CAAAGAAATG TTTAGCTVTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960 
GAAAACTGTA AGCTTCCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020 
ATCTTCTTAC TTGGACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080 

TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140 
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA 

SEQ ID NO:271 PBQ4 DNA sequence 
Nucleic Acid Accession #: AA149579 

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60 

GGCATAAATG GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360 

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660 

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960 

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 

GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260 

GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320 
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA 

SEQ ID NO:272 PBQ4 Protein sequence: 

Protein Accession #: none 

1 11 21 31 41 51 

| | | | | | 

MESISMMGSP KSLSETCLPN GINGIKDARK VTVGVIGSGD FAKSLTIRLI RCGYHVVIGS 60 

RNPKFASEFF PHVVDVTHHE DALTKTNIIF VAIHREHYTS LWDLRHLLVG KILIDVSNNM 120 

RINQYPESNA EYLASLFPDS LIVKGFNVVS AWALQLGPKD ASRQVYICSN NIQARQQVIE 180 

LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVVVAI SLATFFFLYS FVRDVIHPYA 240 

RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300 

CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360 

ISFGIMSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420 
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD 

SEQ ID NO:273 PBQ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001973 

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

CCGCCGCCTT CTACTCCGCC GCGGGGGTCG CAGCGGCTGC CGCGCCGTCC TCGAGTTTCC 60 

AGCGTGAGGA GGAGGCTGAG GGCGGAGAGG CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120 

GAGCCCCGCG CGCGGCGTCG CTCATTGCTA TGGACAGTGC TATCACCCTG TGGCAGTTCC 180 

TTCTTCAGCT CCTGCAGAAG CCTCAGAACA AGCACATGAT CTGTTGGACC TCTAATGATG 240 

GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGGCTCGTCT CTGGGGGATT CGCAAGAACA 300 

AGCCTAACAT GAATTATGAC AAACTCAGCC GAGCCCTCAG ATACTATTAT GTAAAGAATA 360 

TCATCAAAAA AGTGAATGGT CAGAAGTTTG TGTACAAGTT TGTCTCTTAT CCAGAGATTT 420 

TGAACATGGA TCCAATGACA GTGGGCAGGA TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480 

GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAATGGAGG GAAAGATAAA CCACCTCAGC 540 

CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600 

CTCTCAACTC TTTGAACTCC TCCAATGTAA AGCTTTTCAA ATTGATAAAG ACTGAGAATC 660 

CAGCCGAGAA ACTGGCAGAG AAAAAATCTC CTCAGGAGCC CACACCATCT GTCATCAAAT 720 

TTGTCACGAC ACCTTCCAAA AAGCCACCAG TTGAACCTGT TGCTGCCACC ATTTCAATTG 780 

GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGC TTTGGAGACA TTGGTTTCCC 840 

CAAAACTGCC TTCCCTGGAA GCCCCAACCT CTGCCTCTAA CGTAATGACT GCTTTTGCCA 900 

CCACACCACC CATTTCGTCC ATACCCCCTT TGCAGGAACC TCCCAGAACA CCTTCACCAC 960 

CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020 

AACTTCCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTG CTAGAAAAGG 1080 

ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140 

TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGGGAATACT GAGCCCATCT CTCCCTACAG 1200 

CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGCCCCTTGC 1260 

TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320 

TGCAAGGTGC TAACACACTT TTCCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380 

CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440 

CATAACCTAT GCACTTGTGG AATGAGAGAA CCGAGGAACG AAGAAACAGA CATTCAACAT 1500 

GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTGATT 1560 

TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTGAATAGGA CTCAAGTTGG 1620 

ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC ACCTCCCTCT GTCTTTTCCT 1680 

TTTCTTTTTC TTTCCTTCCT TCCTTTTCTT TTCTCCTTTA AAAATATTTT GAGCTTTGTG 1740 

CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800 

TTACTCCTTC TGGCTATTGG GACCCTTTGG CCAGGAAAAA TTATGCTTAG AATCTATTAT 1860 

TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AAA 

SEQ ID NO:274 PBQ5 Protein sequence: 
Protein Accession #: NP_001964 

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60 
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120 
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNV KLFKLIKTEN PAEKLAEKKS 180 
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSISPSSEE TIQALETLVS PKLPSLEAPT 240 
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300 
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360 
TPIILTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420 
PGPFSPDLQK T 

SEQ ID NO:275 PBY3 DNA SEQUENCE 

Nucleic Acid Accession #: AB040921 

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

AATCAGGAAC AGATCATATA TTGACCGAGA TTCTGAGTAT CTCTTGCAAG AAAATGAACC 60 
AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120 
GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGCCTTCG TATGGAATGC AAAAGGAATT 180 
GGTAAATTTA ATTGATAACC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTTGTGGCAA 240 
AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300 
TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAGTGCC ATTTCAGTTG CGGAAAGAGT 360 
AGCTGCAGAA AGGGCAGAAT CTTGTGGCAG TGGTAATAGT ACTGGATATC AAATTCGTCT 420 
CCAQAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480 
TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTGATGAAAT 540 
CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC TTCTCAATTT 600 
TCGATCTGAC TTGAAAGTAA TATTGATGAG TGCAACATTG AATGCAGAAA AGTTTTCAGA 660 
ATATTTTGGT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTCCGG TTGTGGAATA 720 
TCTTTTGGAA GATGTAATTG AAAAAATAAG GTATGTTCCA GAACAAAAAG AACACAGATC 780 
CCAGTTTAAG AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840 
AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900 
AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960 
TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTOCC 1020 
AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080 
AGATAAATTT TTAATTATAC CTTTACATTC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140 
GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200 
TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260 
TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAO TGGGTTAGTA AAGCTAATGC 1320 
CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380 
TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440 
GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500 
TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560 
GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620 
ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680 
CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740 
AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800 
CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860 
CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920 
CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980 
TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040 
TGCTGGTTTA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100 
GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160 
GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220 
TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280 
CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340 
TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400 
TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460 
CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520 
GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580 
GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640 
CTGGGACATG AACAATTTTC ATGTGTAAGO TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700 
ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760 
AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820 
CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880 
AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCOCAAA 2940 
ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000 
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC 



SEQ ID NO:276 PBY3 Protein sequence: 
Protein Accession #: BAA96012 

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60 
VNLIDNHQVT VISGETGCGK TTQVTQFILD NYIERGKGSA CRIVCTQPRR ISAISVAERV 120 
AAERAESCGS GNSTGYQIRL QSRLPRKQGS ILYCTTGIIL QWLQSDPYLS SVSHIVLDEI 180 
HERNLQSDVL MTVVKDLLNF RSDLKVILMS ATLNAEKFSE YFGNCPMIHI PGFTFPVVEY 240 
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300 
STVDVIEMME DDKVDLNLIV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360 
DKFLIIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDVVYV IDGGKIKETH 420 
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480 
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540 
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPLG KEKIADARRK ELAKDTRSDH 600 
LTVVNAFEGW EEARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660 
KDPESNINSD NEKIIKAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720 
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780 
QSPARIAHLV KELRKELDIL LQEKIESPHP VDWNDTKSRD CAVLSAIIDL IKTQEKATPR 840 
NFPPRFQDGY YS 



SEQ ID NO:277 PBY6 DNA SEQUENCE 

Nucleic Acid Accession #: AA464018 
Coding sequence: 64-1669 (underlined sequence corresponds to start and stop codon) 

GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60 
CTTATGGATC TOAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120 
CTGATGACAT ACTTCATCCA GCTGGGCTTT GTCGAGAGTC GATTCTTCCC GCCCACACGG 180 
CAGATGGGAC TCCTGTTCAC CTGGTATGAC TCTCTCACCG GGGTTCCGGT CAGCCAGCAO 240 
AACCTGCTGC TGGAGAAGGC CAGTGTCCTG TTCAACACTG GGGCCCTCTA CACCCAGATT 300 
GGGACCCGGT GTGATCGGCA GACGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360 
AGAGCCGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA CCCATACTCC AAGTTACGAC 420 
ATGAGCCCTG CCATGCTCAG CGTGCTCGTC AAAATGATGC TTGCACAAGC CCAAGAAAGC 480 
GTGTTTGAGA AAATCAGCCT TCCTGGGATC CGGAATGAAT TCTTCATGCT GGTGAAGGTG 540 
GCTCAGGAGG CTGCTAAGGT GGGAGAGGTC TACCAACAGC TACACGCAGC CATGAGCCAG 600 
GCGCCGGTGA AAGAGAACAT CCCCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGCCCAC 660 
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720 
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780 
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCGCCGACA GCTGGGGAAG 840 
TCCCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900 
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAACGCTCC 960 
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGACGCCCCC 1020 
AGTGTTGTTG CTAAAACTGA GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080 
ACAGTCACGG ACTTCTTCCA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGCGG 1140 
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200 
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260 
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTG TGGATTGTAA GTGGCTGACG 1320 
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATCGA GATGAAAGTC 1380 
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGCCACATA CTCCGTGGGA 1440 
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTGATAAA 1500 
ACCAAGAAAA TCTCCAAGAA GCTTTCCTTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560 
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620 
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACTA A 

SEQ ID NO:278 PBY6 Protein sequence: 
Protein Accession #: NP_149094 

DFILEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYFIQLGF VESRFFPPTR 60 
QMGLLFTWYD SLTGVPVSQQ NLLLEKASVL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120 
RAAGVLNYLK DTFTHTPSYD MSPAMLSVLV KMMLAQAQES VFEKISLPGI RNEFFMLVKV 180 
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240 
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300 
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNLIDAP SVVAKTEQEV DIILPQFSKL 360 
TVTDFFQKLG PLSVFSANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCSASVA 420 
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKSATYSVG 480 
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540 
KLPSPFSLLN SDSSWY 

SEQ ID NO:279 PBY8 DNA SEQUENCE 

Nucleic Acid Accession #: AF107493 

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

| | | | | | 

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCGCTACTG CTGCTTCGQT 60 

CTCTCCTTGG GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180 

CATAGACAGG GATGACCGTG ATGAGCGTGA ATCCCGAAGC AGGCGGAGGG ACTCAGATTA 240 

CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300 

TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360 

TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600 

ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTGTTTtfA 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020 

CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAATGGAACA AGTCTGTACA 1140 

ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380 

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGGAC 1560

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGGAGTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 

TTATTGAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAQT TACATTCTTT 1920 

GATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980 

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040 

TTAAAAGTAG AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100 

ATACAAGGTT CATGTGAGTC TGCTTTCTTG ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160 

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220 

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CATTACAGTT 2280 

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA CTCACTGGTA GTTTGGAGTC 2340 

TCCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400 

TTTGCTTCTG TATATCACAG TGAGTGGATG GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460 

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520 

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AOAGGGGTTA GACTCCTTTT 2580 

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640 
CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA 



SEQ ID NO:280 PBY8 Protein sequence:
Protein Accession #: XP_003261

MGSDKRVSRT ERSGRYGSII DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60
ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120
FEGPQPADVR LMKRKTGESL LSS
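
The coding-sequence coordinates quoted for each DNA record (for example 125-556 for SEQ ID NO:279) appear to be 1-based and inclusive of the stop codon. The following is a minimal Python sketch, assuming that convention and the standard genetic code, of how a listing-style sequence could be cleaned, sliced to its coding region and translated. The function names (clean_listing, extract_cds, translate) and the toy sequence are hypothetical illustrations and are not part of the listing.

```python
import re

# Standard genetic code (NCBI translation table 1), built from the
# conventional TCAG ordering of the three codon positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text):
    """Drop position numbers, spaces and line breaks; keep letters only."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def extract_cds(dna, start, end):
    """Return the coding region for 1-based, inclusive coordinates (assumed)."""
    return dna[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon.

    Codons containing unreadable characters are rendered as 'X'.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy record in the listing layout (not taken from any SEQ ID).
toy = "CCCATGGGTT CAGACAAATG AGG 23"
dna = clean_listing(toy)
print(translate(extract_cds(dna, 4, 21)))  # prints MGSDK
```

Translating a coding region this way and comparing the result with the paired protein record is one way to spot transcription errors in either sequence.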

SEQ ID NO:281 PCI2 DNA SEQUENCE 

Nucleic Acid Accession #: AF208291

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGGC TGAGGGGGCC GAGCTCGCGC 60

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCCGAT GGCCCCCGTQ 120

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240

GGCTCCCACA GCAAAGTGTA CAGCCAOAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTCCTT GCCGGTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420 

GGGCAAGTCC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600 

ACTGCCACCA CGTCTACTGC CACCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840

AGCATCCTGG CCCGGTTGAG CACGGAGAGT GCCGATGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAATACAT TCGCCCAGTT 1020 

CTCCAGCAGG TAGCCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080 

CTCAAACCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260 

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATCCT 1440

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTCA CCATGACACA CTTACTCGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCCCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTCCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGCCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGCG TGCTGGGCAC 2940

AACAATGCCA ATGCCTTTGA CACCAAGGGG AGCCTGGAGA ATCACTGCAC GGGGAACCCC 3000

CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGCG AAGTATTGGT GGAGTGTGAT 3060

AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA GTCCTCCAGC 3120

AACGTGACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180

CAGCAGCGGC CGGGCCCCCA CTTCCAGCAG CAGCAGCCAC TCAATCTCAG CCAGGCTCAG 3240

CAGCACATCA CCACGGACCG CACTGGGAGC CACCGAAGGC AGCAGGCCTA CATCACTCCC 3300

ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCAGCCACGG CACTGTGCAC 3360

CCGCATCTGG CTGCAGCCGC TGCCGCTGCC CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420

TACACTGCGC CGGCGGCCCT GGGCTCCACC GGCACCGTGG CCCACCTGGT GGCCTCGCAA 3480

GGCTCTGCGC GCCACACCGT GCAGCACACT GCCTACCCAG CCAGCATCGT CCACCAGGTC 3540

CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600

GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAG CCTCCACCGT CTACACTGGA 3660

TACCCACTGA GCCCCGCCAA GGTCAACCAG TACCCTTACA TATAAACACT GGAGGGGAGG 3720

GAGGGAGGGA GGGAGGGAGA GAATGGCCCG AGGGAGGAGG GAGAGAAGGA GGGAGGCGCT 3780

CCTGGGACCG TGGGCGCTGG CCTTTTATAC TGAAGATGCC GCACACAAAC AATGCAAACG 3840

GGGCAGGGGC GGGGGGGGGG GGGGCAGAGG GCAGGGGGAC GGGTCGGGAC ACCAGTGAAA 3900

CTTGAACCGG GAAGTGGGAG GACGTAOAGC AGAGAAGAGA ACATTTTTAA AAGGAAGGGA 3960
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTTTAAAAAA



SEQ ID NO:282 PCI2 Protein sequence:
Protein Accession #: NP_073577

MAPVYEGMAS HVQVFSPHTL QSSAFCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60
QPASTTVSTS LPVPNPSLPY EQTIVFPGST GHIVVTSASS TSVTGQVLGG PHNLMRRSTV 120
SLLDTYQKCG LKRKSEEIEN TSSVQIIEEH PPMIQNNASG ATVATATTST ATSKNSGSNS 180
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQVVKCWKRG TNEIVAIKIL KNRPSYARQG 240
QIEVSILARL STESADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300
IRPVLQQVAT ALMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDFGSA SHVSKAVCST 360
YLQSRYYRAP EIILGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYIF NCLDDMAQVN 480
MTTDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPIE TLNHPFVTMT HLLDFPHSTH 540
VKSCFQNMEI CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600
ATISLANPEV SILNYPSTLY QPSAASMAAV AQRSMPLQTG TAQICARPDP FQQALIVCPP 660
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720
QQLTGVATHT SVQHATVIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTGHVTLPAA 780
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKRVKENTPP 840
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTIVIPDT PSPTVSVITI SSDTDEEEEQ 900
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960
TGNPRTIIVP PLKTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020
ITYRQQRPGP HFQQQQPLNL SQAQQHITTD RTGSHRRQQA YITPTMAQAP YSFPHNSPSH 1080
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYISASPAST VYTGYPLSPA KVNQYPYI

SEQ ID NO:283 PBY1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017700

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60 

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 

GCCTCCAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 

TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TGACAGGACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480 

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 

TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACCGGC CATGCAAGTA ATTGGAGAAG 660 

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGCC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 

CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID NO:284 PBY1 Protein sequence:
Protein Accession #: NPJK0170

1 11 21 31 41 51

| | | | | |

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60

NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120

VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK



SEQ ID NO:285 PBQ9 DNA SEQUENCE

Nucleic Acid Accession #: X88534

Coding sequence: 523-2676 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 

GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 

TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTQ 1260

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400

CCTGGTTTCG TGTTTACCCC TCGATCAAGO GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 

OCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence:
Protein Accession #: Q02108

1 11 21 31 41 51

| | | | | |

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60

QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120

QAVAAGVPVB VIKESLGEEV FKICYEEDEN ILGVVGGTLK DFLNSFSTLL KQSSHCQEAG 180

KRGRLEDASI LCLDKEDDFL HVYYFFPKRT TSLILPGIIK AAAHVLYETE VEVSLMPPCF 240

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF MFDKDMTILQ 300

FGNGIRRLMN RRDFQGKPNF EEYFEILTPK INQTFSGIMT MLNMQFVVRV RRWDNSVKKS 360

SRVMDLKGQM IYIVESSAIL FLGSPCVDRX EDFTGRGLYL SDIPIHMALR DVVLIGEQAR 420

AQDGLKKRLG KLKATLEQAH QALEEEKKKT VDLLCSIFPC EVAQQLWQGQ VVQAKKFSNV 480

TMLFSDIVGF TAICSQCSPL QVITMLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540

ESDTHAVQIA LMALKMMELS DEVMSPHGEP IKMRIGLHSG SVFAGVVGVK MPRYCLFGNN 600
VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPKFPSEI PGICHFLDAY 660
QQGTNSKPCF QKKDVEDGNA NFLGKASGID

SEQ ID NO:287 PF02 DNA SEQUENCE

Nucleic Acid Accession #: NM_000720

Coding sequence: 119-6664 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT GTTCGTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CGGACCACGC 180

GAACGAGGCA AACTATGCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240

TCAGCCGAAT AGCTCCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCG 420

ACCTGCCCGC GCCCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAG TGTAATTTTG GAACAATTAA CCAAAGAAAC 780

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCCGAA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTGT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA AGTCCATCGC TTCGCTGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCCAGAT 2280

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAQTACAAGT TCTGGTACGT 3780

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080

TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCGTCTTTT 4140

CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200

GACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260 

CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATGA GAGATAACAA 4320

CCAGATCAAT AGGAACAATA ACTTCCAGAC GTTTCCCCAG GCGGTGCTGC TGCTCTTCAG 4380

GTGTGCAACA GGTGAGGCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440

TGACCCTGAG TCAGATTACA ACCCCGGGGA GGAGTATACA TGTGGGAGCA ACTTTGCCAT 4500 

TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGG GGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680

ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800 

CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT CGAACGGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTG ATGATGAGGT 4980

AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100 

GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA 5160 

TTTGCAAGAT GACGAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220 

AAATGGTGCC CTGCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTCCCT 5280

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520

TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGACCCCC ACTGCTTGGG 5700 

GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAG 5760 

CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTGAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000

CATCTTCCCC CATCGCACGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCCCTGAT 6180

CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTGCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480

TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTCCCCGA GCCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGOGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780

AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840 

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020

AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA CACCTCGTGT 7080

CGTTACCTCA GCCATCGGTC TAGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140 
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA 

SEQ ID NO:288 PFD2 Protein sequence:
Protein Accession #: A38198

1 11 21 31 41 51

| | | | | |

MMMMMMMKKM QHQRQQQADH ANEANYARGT RLPLSGEGPT SQPNSSKQTV LSWQAAIDAA 60

RQAKAAQTMS TSAPPPVGSL SQRKRQQYAK SKKQGNSSNS RPARALFCLS LNNPIRRACI 120

SIVEWKPFDI FILLAIFANC VALAIYIPFP EDDSNSTNHN LEKVEYAFLI IFTVETFLKI 180

IAYGLLLHPN AYVRNGWNLL DFVIVIVGLF SVILEQLTKE TEGGNHSSGK SGGFDVKALR 240

AFRVLRPLRL VSGVPSLQVV LNSIIKAMVP LLHIALLVLF VIIIYAIIGL ELFIGKMHKT 300

CFFADSDIVA EEDPAPCAFS GNGRQCTANG TECRSGWVGP NGGITNFDNF AFAMLTVFQC 360

ITMEGWTDVL YWVNDAIGWE WPWVYFVSLI ILGSFFVLNL VLGVLSGEFS KEREKAKARG 420

DFQKLREKQQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTENVS 480

GEGENRGCCG SDWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FNRRRCRAAV 540

KSVTFYWLVI VLVFLNTLTI SSEHYNQPDW LTQIQDIANK VLLALFTCEM LVKMYSLGLQ 600

AYFVSLFKRF DCFVVCGGIT ETILVELEIM SPLGISVFRC VRLLRIFKVT RHWTSLSNLV 660

ASLLNSMKSI ASLLLLLFLF IIIFSLLGMQ LFGGKFNFDE TQTKRSTFDN FPQALLTVFQ 720

ILTGEDWNAV MYDGIMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780

LNTAQKEEAE EKERKKIARK ESLENKKNNK PEVNQIANSD NKVTIDDYRE EDEDKDPYPP 840

CDVPVGEEEE EEEEDEPEVP AGPRPRRISE LKMKEKIAPI PEGSAFFILS KTNPIRVGCH 900

KLINHHIFTN LILVFIMLSS AALAAEDPIR SRSFRNTILG YFDYAFTAIF TVEILLKMTT 960

FGAFLHKGAF CRNYFNLLCM LVVGVSLVSF GIQSSAISVV KILRVLRVLR PLRAINRAKG 1020

LKHVVQCVFV AIRTIGNIMI VTTLLQFMFA CIGVQLFKGK FYRCTDEAKS NPEECRGLFI 1080

LYKDGDVDSP VVRERIWQNS DFNFDNVLSA MMALFTVSTF EGWPALLYKA IDSNGENIGP 1140

IYNHRVEISI FFIIYIIIVA FFMMNIFVGF VIVTFQEQGE KEYKNCELDK NQRQCVEYAL 1200

KARPLRRYIP KNPYQYKFWY VVNSSPFEYM MFVLIMLNTL CLAMQHYEQS KMFNDAMDIL 1260

NMVFTGVFTV EMVLKVIAFK PKGYFSDAWN TFDSLIVIGS IIDVALSEAD PTESENVPVP 1320

TATPGNSEES NRISITFFRL FRVMRLVKLL SRGEGIRTLL WTFIKSFQAL PYVALLIAML 1380

FFIYAVIGMQ KFGKVAMRDN NQINRNNNFQ TFPQAVLLLF RCATGEAWQE IMLACLPGKL 1440

CDPESDYNPG EEYTCGSNFA IVYFISFYML CAFLIINLFV AVIMDNFDYL TRDWSILGPH 1500

HLDEFKRIWS EYDPEAKGRI KHLDVVTLLR RIQPPLGFGK LCPHRVACKR LVAMNMPLNS 1560

DGTVMFNATL FALVRTALKI KTEGNLEQAN EELRAVIKKI WKKTSMKLLD QVVPPAGDDE 1620

VTVGKFYATF LIQDYFRKFK KRKEQGLVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680

DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740

ASDTEKPLFP PAGNSVCHNH HNHNSIGKQV PTSTNANLNN ANMSKAAHGK RPSIGNLEHV 1800

SENGHHSSHK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPEI HGYFRDPHCL 1860

GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGF LEDDDSPVCY 1920

DSRRSPRRRL LPPTPASHRR SSFNFECLRR QSSQEEVPSS PIFPHRTALP LHLMQQQIMA 1980

VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040

RSSWYTDEPD ISYRTFTPAS LTVPSSFRNK NSDKQRSADS LVEAVLISEG LGRYARDPKF 2100

VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160
EPDPGRDEED LADEMICITT L

SEQ ID NO:289 OBI6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002812

Coding sequence: 150-3362 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120

CCTCAGCTCC TTTTCCTGAG CCCQCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGGCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTC 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA OCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA OGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGOCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 
CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480 

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTT6 GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900 

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTOTTGTT TTTTTGTTTT 4140 
TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 
SEQ ID NO:290 OBI6 Protein sequence: 
Protein Accession #: NP_002812 



1 11 21 31 41 51 

| | | | | | 

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAFGP 60 

VHVYWLLDGA PVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSANASFN 120 

IKWIEAGPW LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180 

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFARWLA PQDWVARYE 240 

EAHFHCQFSA QPPPSLQ.WLF EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300 

CIGQGQRGPP IILEATLHLA EIEDMPLFEP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360 

VRLPTHGRVY QKGHELVLAN IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420 

SQLEEGKPGY LDCLTQATPK PTVVWYRNQM LISEDSRFEV FKNGTLRINS VEVYDGTWYR 480 

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPQGQI RAHVQLTVAV FITFKVEPER 600 

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660 

RYTCIAGNSC NIKHTEAPLY VVDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720 

GLMFYCKKRC KAKRLQKQPE GEEPEMECLN GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780 

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840 

LDFRRELEMF GKLNHANVVR LLGLCREAEP HYMVLEYVDL GDLKQFLRIS KSKDEKLKSQ 900 

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRQVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL RWMSPEAILE GDFSTKSDVW AFGVLMWEVF THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSFSEIASA LGDSTVDSKP 



SEQ ID NO:291 AA81 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002205 

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60 

CGCCGACCCC CGCTGCTGCC GCTGCTGTTG CTGCTGCTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1960 

GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040 

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC 7GATGCCTGA 



SEQ ID NO:292 AAB1 Protein sequence: 
Protein Accession #: NP_002196 

1 11 21 31 41 51 

| | | | | | 

MGSRTPESPL HAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 

GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 

LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180 

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240 

IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 

QQGVVFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480 

VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540 

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600 

LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 

GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 

FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780 

SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 

SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 

SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960 

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 



SEQ ID NO:293 LBH4 DNA SEQUENCE 

Nucleic Acid Accession #: BC001291 

Coding sequence: 44-541 (start and stop codons are underlined) 

1 11 21 31 41 51 
| | | | | | 

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACXXjCTGGGG ACGATGGCGC TGCTCGCCTT 60 
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACXKX AACCTGACTG CGAGACAACG 120 
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180 
TGAGAGAGAA AACACTTTCG AGTGCCAGAA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240 
CTGCGTTATA GCGGCCGTGA AAATATTTCC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300 

CGCTGGTTGT GCAGCGATGG AGAGACGCAA GCCAOAGGAG AAGCGGTTTC TCCTGGAAGA 360 
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420 
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG 480 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGOCTCA GCCTGTCIIQ 540 
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTCGCTCCA GACCGTTGTC 600 

ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTTG GTTTGACTTC CCAGGGTCTT 660 

GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG 780 
AAATCAAAOC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTOCTTGA CTCCCCTCTG 840 

CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900 

TGCTGAGATG CTTCCGACCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960 
GGGTGAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020 
AGGGCTGCCC CX^TTCCAGT GGTGGAGGCG CTGTGGATGG CTGCTTTTCC 7XAACCTTTC 1080 
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140 

ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 

ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260 
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTT CACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 
SEQ ID NO:294 LBH4 Protein sequence: 
Protein Accession #: AAH01291 



1 11 21 31 41 51 
| | | | | | 

MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60 
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120 
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS 
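
The coordinates given in the listings above can be cross-checked arithmetically. The following Python sketch is not part of the specification. It assumes that a stated coding-sequence range is 1-based and inclusive and that it contains the stop codon (as the header notes about underlined start and stop codons suggest), and that the number printed at the end of a listing line is a running position rather than sequence. The function names are hypothetical.

```python
def protein_length_from_cds(start: int, end: int) -> int:
    """Residues encoded by a coding range (1-based, inclusive, stop codon included)."""
    n_bases = end - start + 1
    if start < 1 or n_bases < 6 or n_bases % 3 != 0:
        raise ValueError("range must cover a whole number of codons")
    return n_bases // 3 - 1  # the stop codon encodes no residue


def count_residues(listing: str) -> int:
    """Count sequence letters in a listing, skipping running position numbers."""
    return sum(len(tok) for tok in listing.split() if not tok.isdigit())


# LBH4 protein listing as printed above (Protein Accession AAH01291).
LBH4_PROTEIN = """
MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS
"""

if __name__ == "__main__":
    print(protein_length_from_cds(44, 541))   # LBH4 coding range
    print(count_residues(LBH4_PROTEIN))       # residues actually listed
```

For the LBH4 entry, the range 44-541 spans 498 bases, that is, 166 codons including the stop, so both calls give 165. Applied to the coding range 1-3150 of SEQ ID NO:291, the same function gives 1049 residues.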



It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference. 
WHAT IS CLAIMED IS: 

1. A method of detecting a prostate cancer-associated transcript in a cell from a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

2. The method of claim 1, wherein the polynucleotide selectively hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

3. The method of claim 1, wherein the biological sample is a tissue sample. 

4. The method of claim 1, wherein the biological sample comprises isolated nucleic acids. 

5. The method of claim 4, wherein the nucleic acids are mRNA. 

6. The method of claim 4, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide. 

7. The method of claim 1, wherein the polynucleotide comprises a sequence as shown in Tables 1-16. 

8. The method of claim 1, wherein the polynucleotide is labeled. 

9. The method of claim 8, wherein the label is a fluorescent label. 

10. The method of claim 1, wherein the polynucleotide is immobilized on a solid surface. 

11. The method of claim 1, wherein the patient is undergoing a therapeutic regimen to treat prostate cancer. 

12. The method of claim 1, wherein the patient is suspected of having prostate cancer. 

13. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated transcript in the biological sample by contacting the biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring the efficacy of the therapy. 

14. The method of claim 13, further comprising the step of: (iii) comparing the level of the prostate cancer-associated transcript to a level of the prostate cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

15. The method of claim 13, wherein the patient is a human. 

16. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated antibody in the biological sample by contacting the biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, wherein the polypeptide specifically binds to the prostate cancer-associated antibody, thereby monitoring the efficacy of the therapy. 

17. The method of claim 16, further comprising the step of: (iii) comparing the level of the prostate cancer-associated antibody to a level of the prostate cancer-associated antibody in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

18. The method of claim 16, wherein the patient is a human. 

19. A method of monitoring the efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient undergoing the therapeutic treatment; and 
(ii) determining the level of a prostate cancer-associated polypeptide in the biological sample by contacting the biological sample with an antibody, wherein the antibody specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring the efficacy of the therapy. 

20. The method of claim 19, further comprising the step of: (iii) comparing the level of the prostate cancer-associated polypeptide to a level of the prostate cancer-associated polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

21. The method of claim 19, wherein the patient is a human. 

22. An isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Tables 1-16. 

23. The nucleic acid molecule of claim 22, which is labeled. 

24. The nucleic acid of claim 23, wherein the label is a fluorescent label. 

25. An expression vector comprising the nucleic acid of claim 22. 

26. A host cell comprising the expression vector of claim 25. 

27. An isolated polypeptide which is encoded by a nucleic acid molecule having polynucleotide sequence as shown in Tables 1-16. 

28. An antibody that specifically binds a polypeptide of claim 27. 

29. The antibody of claim 28, further conjugated to an effector component. 

30. The antibody of claim 29, wherein the effector component is a fluorescent label. 

31. The antibody of claim 29, wherein the effector component is a radioisotope or a cytotoxic chemical. 

32. The antibody of claim 29, which is an antibody fragment. 

33. The antibody of claim 29, which is a humanized antibody. 

34. A method of detecting a prostate cancer cell in a biological sample from a patient, the method comprising contacting the biological sample with an antibody of claim 28. 

35. The method of claim 34, wherein the antibody is further conjugated to an effector component. 

36. The method of claim 35, wherein the effector component is a fluorescent label. 

37. A method of detecting antibodies specific to prostate cancer in a patient, the method comprising contacting a biological sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16. 

38. A method for identifying a compound that modulates a prostate cancer-associated polypeptide, the method comprising the steps of: 
(i) contacting the compound with a prostate cancer-associated polypeptide, the polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16; and 
(ii) determining the functional effect of the compound upon the polypeptide. 

39. The method of claim 38, wherein the functional effect is a physical effect. 

40. The method of claim 38, wherein the functional effect is a chemical effect. 

41. The method of claim 38, wherein the polypeptide is expressed in a eukaryotic host cell or cell membrane. 

42. The method of claim 38, wherein the functional effect is determined by measuring ligand binding to the polypeptide. 

43. The method of claim 38, wherein the polypeptide is recombinant. 

44. A method of inhibiting proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the method comprising the step of administering to the subject a therapeutically effective amount of a compound identified using the method of claim 38. 

45. The method of claim 44, wherein the compound is an antibody. 

46. The method of claim 45, wherein the patient is a human. 

47. A drug screening assay comprising the steps of: 
(i) administering a test compound to a mammal having prostate cancer or a cell isolated therefrom; 
(ii) comparing the level of gene expression of a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in a control cell or mammal, wherein a test compound that modulates the level of expression of the polynucleotide is a candidate for the treatment of prostate cancer. 

48. The assay of claim 47, wherein the control is a mammal with prostate cancer or a cell therefrom that has not been treated with the test compound. 

49. The assay of claim 47, wherein the control is a normal cell or mammal. 

50. A method for treating a mammal having prostate cancer comprising administering a compound identified by the assay of claim 47. 

51. A pharmaceutical composition for treating a mammal having prostate cancer, the composition comprising a compound identified by the assay of claim 47 and a physiologically acceptable excipient. 

52. The method according to claim 1, wherein said biological sample is contacted with a plurality of polynucleotides comprising a first polynucleotide that selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at least 80% identical to a second sequence as shown in Tables 1-16. 

53. A method according to claim 52, wherein the plurality of polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at least 80% identical to a third sequence as shown in Tables 1-16. 

54. A method of detecting a prostate cancer associated transcript, the method comprising contacting a biological sample from the patient with a plurality of polynucleotides wherein at least two of said polynucleotides selectively hybridize to a different sequence at least 80% identical to a sequence as shown in Tables 1-16. 

55. A method of detecting a prostate cancer, the method comprising the steps of: 
(i) providing a biological sample from a patient; 
(ii) contacting the biological sample with a first polynucleotide that selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to determine the level of a prostate cancer-associated transcript in the biological sample; and with a second polynucleotide that selectively hybridizes to a second sequence at least 80% identical to a sequence not shown in Tables 1-16; wherein the expression of said second sequence is not substantially changed in prostate cancer, to determine the level of expression of a control transcript in the biological sample; 
(iii) comparing the level of the prostate cancer-associated transcript to a level of the normal tissue associated transcript in the biological sample. 

56. A method of quantitating a prostate cancer-associated transcript in a cell from a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

57. The method of claim 56, wherein the polynucleotide selectively hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

58. The method of claim 56, wherein the biological sample is a tissue sample. 

59. The method of claim 56, wherein the biological sample comprises isolated nucleic acids. 

60. The method of claim 56, wherein the nucleic acids are mRNA. 

61. The method of claim 59, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide. 

62. The method of claim 56, wherein the polynucleotide comprises a sequence as shown in Tables 1-16. 

63. The method of claim 56, wherein the polynucleotide is labeled. 

64. The method of claim 63, wherein the label is a fluorescent label. 

65. The method of claim 56, wherein the polynucleotide is immobilized on a solid surface. 

66. The method of claim 56, wherein the patient is undergoing a therapeutic regimen to treat metastatic prostate cancer. 

67. The method of claim 56, wherein the patient is suspected of having metastatic prostate cancer. 

68. A biochip comprising a plurality of polynucleotides that selectively hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

69. A method of screening drug candidates comprising: 
i) providing a cell that expresses an expression profile gene selected from the group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof; 
ii) adding a drug candidate to said cell; and 
iii) determining the effect of said drug candidate on the expression of said expression profile gene. 

70. A method according to claim 59 wherein said determining comprises comparing the level of expression in the absence of said drug candidate to the level of expression in the presence of said drug candidate. 
